Methods and compositions for cannabis characterization

ABSTRACT

Provided are methods for determining whether a cannabis sample comprises hemp or marijuana, or Cannabis sativa and/or Cannabis indica, as well as primers and kits for use in the methods.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a national phase entry of PCT/CA2016/050678, filed Jun. 13, 2016, which claims priority from U.S. Provisional patent application Ser. No. 62/175,006 filed Jun. 12, 2015, each of these applications being incorporated herein in their entirety by reference.

INCORPORATION OF SEQUENCE LISTING

A computer readable form of the Sequence Listing “P48554US01_SequenceListing” (161,208 bytes), submitted via EFS-WEB and created on Dec. 12, 2017, is herein incorporated by reference.

FIELD

The present disclosure provides methods, compositions and kits for characterizing cannabis samples. The present disclosure also provides methods, compositions and kits for distinguishing Cannabis sativa from Cannabis indica, and marijuana from hemp, as well as for measuring the contribution of Cannabis sativa and Cannabis indica in marijuana.

BACKGROUND

Cannabis is one of humanity's oldest crops, with records of use dating to 6000 years before present. It is used as a source of high-quality bast fibre, nutritious and oil-rich seeds and for the production of cannabinoid compounds including delta-9 tetrahydrocannabinol (THC) and cannabidiol (CBD). The evolutionary history and taxonomy of Cannabis remain poorly understood. Hillig (2005) proposed that the genus Cannabis consists of three species (C. sativa, C. indica, and C. ruderalis) [1], whereas an alternative viewpoint is that Cannabis is monotypic and that observable subpopulations represent subspecies of C. sativa: C. sativa subspecies sativa, C. sativa subspecies indica and C. sativa subspecies ruderalis [2]. The putative ruderalis type may represent feral populations of the other types or those adapted to northern regions. The classification of Cannabis populations is confounded by many cultural factors, and tracing the history of a plant that has seen wide geographic dispersal and artificial selection by humans over thousands of years has proven difficult. Many hemp types have varietal names while marijuana types lack an organized horticultural registration system and are referred to as strains. The draft genome and transcriptome of C. sativa were published in 2011 [3]. As both public opinion and legislation in many countries shift towards recognizing Cannabis as a plant of medical and agricultural value [4], the genetic characterization of marijuana and hemp becomes increasingly important for both clinical research and crop improvement efforts.

Differences between Cannabis sativa and Cannabis indica have been reported.

Although the taxonomy of the genus Cannabis remains unclear, many breeders, growers and users (patients) consuming cannabis for its psychoactive and/or medicinal properties differentiate Sativa-type from Indica-type plants.

Hillig & Mahlberg (2004) [20] have reported that mean THC levels and the frequency of the THCA synthase gene (B_(T) allele) were significantly higher in C. indica than C. sativa. Plants with relatively high levels of tetrahydrocannabivarin (THCV) and/or cannabidivarin (CBDV) were common only in C. indica.

Hazekamp & Fischedick (2011) [10] summarized differences between typical Sativa and Indica effects upon smoking. They indicate that, as a result of limited understanding and support from the medical community, medicinal users of cannabis generally adopt the terminology derived from recreational users to describe the therapeutic effects they experience.

They report that the psychoactive effects (the “high”) from Sativa-type plants are often characterized as uplifting and energetic. The effects are mostly cerebral (head-high), and are also described as spacey or hallucinogenic. Sativa is considered as providing pain relief for certain symptoms. In contrast, the high from Indica-type plants is most often described as a pleasant body buzz (body-high or body stone). Indicas are primarily enjoyed for relaxation, stress relief, and for an overall sense of calm and serenity and are supposedly effective for overall body pain relief and in the treatment of insomnia.

They reported that the most common way currently used to classify cannabis cultivars is through plant morphology (phenotype), with Indica-type plants being smaller in height with broader leaves, while Sativa-type plants are taller with long, narrow leaves. Indica-type plants typically mature faster than Sativa-type plants under similar conditions, and the types tend to have a different smell, perhaps reflecting a different profile of terpenoids.

There remains a need for more accurate classification of cannabis for medicinal and other commercial purposes.

SUMMARY

Using 14,031 single-nucleotide polymorphisms (SNPs) genotyped in 81 marijuana and 43 hemp samples, marijuana and hemp are found to be significantly differentiated at a genome-wide level, demonstrating that the distinction between these populations is not limited to genes underlying THC production.

In addition, using further SNPs, including a second set of 9123 SNPs genotyped in 37 reported Cannabis indica and 63 reported Cannabis sativa samples, ancestry determinations could be made which can be used, for example, for selecting breeding partners.

Other features and advantages of the present disclosure will become apparent from the following detailed description. It should be understood, however, that the detailed description and the specific examples, while indicating preferred embodiments of the disclosure, are given by way of illustration only, since various changes and modifications within the spirit and scope of the disclosure will become apparent to those skilled in the art from this detailed description.

BRIEF DESCRIPTION OF THE DRAWINGS

An embodiment of the present disclosure will now be described in relation to the drawings in which:

FIG. 1. Genetic structure of marijuana and hemp. (a) Principal Components Analysis (PCA) plot of 42 hemp and 80 marijuana samples using 14,031 SNPs. Hemp samples are closed circles and marijuana samples are open circles. The proportion of the variance explained by each Principal Component (PC) is shown in parentheses along each axis. The two samples labeled with their IDs are discussed in the text. (b) Boxplots showing significantly lower heterozygosity in marijuana than in hemp. (c) Population structure of hemp and marijuana estimated using the fastSTRUCTURE admixture model at K=2. Each sample is represented by a thin vertical line, which is partitioned into two colored segments that represent the sample's estimated membership in each of the two inferred clusters. Hemp and marijuana samples are labeled below the plot.

FIG. 2. Genetic structure of marijuana. (a) PCA plot of 81 marijuana samples using 9,776 SNPs. Samples are shaded according to their reported C. sativa ancestry. The proportion of the variance explained by each PC is shown in parentheses along each axis. (b) Population structure of marijuana calculated using the fastSTRUCTURE admixture model at K=2. Each sample is represented by a horizontal bar, which is partitioned into two segments that represent the sample's estimated membership in each of the two inferred clusters. Adjacent to each bar is the sample's name and reported % C. sativa ancestry. (c) The correlation between the principal axis of genetic structure (PC1) in marijuana and reported C. sativa ancestry.

FIG. 3. Distribution of FST between marijuana and hemp samples across 14,031 SNPs. (a) FST distribution for all SNPs genotyped. (b) Distribution of SNPs with FST greater than 0.5. Average FST is weighted by allele frequency and was calculated according to equation 10 in Weir and Cockerham (1984) [19].

FIG. 4. Mean pairwise Identity by State (IBS) between each marijuana sample and all hemp samples versus reported C. sativa ancestry.

FIG. 5. Example PCA of 81 marijuana strains using 9776 SNPs.

FIG. 6. Example distribution of per-SNP F_(ST) values between 9 presumed C. indica and 9 presumed C. sativa strains.

FIG. 7. Example evaluation of panels of ancestry informative markers (AIMs). Accuracy is defined here as the correlation between the positions of non-ancestral samples along PC1 calculated using 9766 SNPs and the positions calculated using a given subset of AIMs.

FIG. 8. Example PCA of 100 marijuana strains using 9123 SNPs.

DETAILED DESCRIPTION OF THE DISCLOSURE

The term “cannabis reference” as used herein means a cannabis strain, species (e.g. sativa or indica) (also referred to as subspecies (e.g. sativa or indica)) or type (marijuana or hemp) with at least some known genotype profile information which is used as a reference comparison to a test sample, optionally wherein the genotype and/or allele frequency of at least 10 SNPs in Table 4, 5 and/or 8 are known, optionally all of the SNPs in any one of Tables 4, 5 and/or 8. The cannabis reference can be a Cannabis sativa reference, Cannabis indica reference, marijuana reference or hemp reference or a reference profile of any of the foregoing.

The terms “Cannabis sativa reference” and “Cannabis indica reference” as used herein mean, respectively, a selected Cannabis sativa strain or Cannabis indica strain which is used as a reference for comparison and/or genotype information of such a strain or genotype information associated with the particular Cannabis sativa or Cannabis indica reference strain, e.g. a reference profile for a particular strain or a reference profile associated with the species. The reference profile comprises at least 10 known SNPs (e.g. genotype and optionally frequency) in Table 4 and/or Table 8, optionally all of the SNPs in Table 4 and/or 8 found in the particular Cannabis sativa or Cannabis indica strain respectively or a composite of strains of the particular species. The Cannabis sativa reference or Cannabis indica reference can include, in addition to the predominant allele in the species or a particular strain of the species, the frequency of the SNP allele in the population.

The term “cannabis reference profile” as used herein means genotype information of one (e.g. a particular strain) or plurality of cannabis strains and/or species, including Cannabis sativa and/or Cannabis indica strains or marijuana and/or hemp strains, and includes the genotype of at least 10 SNPs in Table 4, 5 and/or 8, optionally all of the SNPs in Table 4, 5 and/or 8. A Cannabis sativa reference profile as used herein means genotype information of a plurality of cannabis strains and includes genotype sequence (and optionally including frequency information) associated with Cannabis sativa strains and a Cannabis indica reference profile as used herein means genotype information of a plurality of cannabis strains and includes genotype sequence (and optionally including frequency information) associated with Cannabis indica strains.

The term “marijuana” as used herein denotes cannabis plants and plant parts that are cultivated and consumed as a drug or medicine. Marijuana often contains high amounts of psychoactive cannabinoids such as tetrahydrocannabinolic acid (THCA) and delta-9 tetrahydrocannabinol (THC) but it may also contain cannabidiolic acid (CBDA) and cannabidiol (CBD). For example, marijuana can be defined as cannabis plants and plant parts wherein the leaves and flowering heads contain more than 0.3% w/w, 0.4% w/w or 0.5% w/w of delta-9-tetrahydrocannabinol (THC) (dry weight). The term “hemp” as used herein denotes cannabis plants that are cultivated and used for the production of fibre or seeds rather than as drug or medicine. Hemp plants often contain high amounts of CBDA and CBD, and low amounts of THCA and THC. For example, hemp can be defined as cannabis plants and plant parts wherein the leaves and flowering heads do not contain more than 0.3% w/w, 0.4% w/w or 0.5% w/w of delta-9-tetrahydrocannabinol (THC) (dry weight).
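The dry-weight THC thresholds in the example definitions above can be expressed as a simple comparison. The following is a minimal sketch only; the function name `classify_by_thc` and its parameters are hypothetical and are not part of the disclosure.

```python
def classify_by_thc(thc_percent_dry_weight: float, threshold: float = 0.3) -> str:
    """Label material as 'marijuana' or 'hemp' from its THC content.

    thc_percent_dry_weight: THC in % w/w of dry weight (leaves and flowering heads).
    threshold: cut-off in % w/w; the text gives 0.3, 0.4 or 0.5 as examples.
    """
    if thc_percent_dry_weight < 0 or threshold <= 0:
        raise ValueError("THC content must be >= 0 and threshold must be > 0")
    # "More than" the threshold is marijuana; at or below the threshold is hemp.
    return "marijuana" if thc_percent_dry_weight > threshold else "hemp"
```

For instance, under the 0.5% w/w example threshold a sample measured at 0.45% w/w would be labelled hemp, whereas under the 0.3% w/w threshold it would be labelled marijuana.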

The term “polynucleotide”, “nucleic acid”, “nucleic acid molecule” and/or “oligonucleotide” as used herein refers to a sequence of nucleotide or nucleoside monomers consisting of naturally occurring and/or modified bases, sugars, and intersugar (backbone) linkages, and is intended to include DNA and RNA which can be either double stranded or single stranded, representing the sense or antisense strand.

As used herein, the term “isolated nucleic acid molecule” refers to a nucleic acid substantially free of cellular material or culture medium when produced by recombinant DNA techniques, or chemical precursors, or other chemicals when chemically synthesized. The term “nucleic acid” is intended to include DNA and RNA and can be either double stranded or single stranded.

The term “primer” as used herein refers to a nucleic acid molecule, whether occurring naturally as in a purified restriction digest or produced synthetically, which is capable of acting as a point of synthesis when placed under conditions in which synthesis of a primer extension product, which is complementary to a nucleic acid strand, is induced (e.g. in the presence of nucleotides and an inducing agent such as DNA polymerase and at a suitable temperature and pH). The primer must be sufficiently long to prime the synthesis of the desired extension product in the presence of the inducing agent. The exact length of the primer will depend upon factors including temperature, sequences of the primer and the methods used. A primer typically contains 15-25 or more nucleotides, although it can contain fewer, for example 10 nucleotides. The factors involved in determining the appropriate length of primer are readily known to one of ordinary skill in the art.

As used herein, the term “upstream primer” refers to a primer that can hybridize to a DNA sequence and act as a point of synthesis upstream, or at a 5′ end, of a target polynucleotide sequence, e.g. a SNP, to produce a polynucleotide complementary to the target polynucleotide anti-sense strand. The term “downstream primer” as used herein refers to a primer that can hybridize to a polynucleotide sequence and act as a point of synthesis downstream, or at a 3′ end, of a target polynucleotide sequence, to produce a polynucleotide complementary to the target polynucleotide sense strand.

The term “probe” as used herein refers to a polynucleotide (interchangeably used with nucleic acid) that comprises a sequence of nucleotides that will hybridize specifically to a target nucleic acid sequence. For example, the probe comprises at least 18 or more bases or nucleotides that are complementary and hybridize to contiguous bases and/or nucleotides in the target nucleic acid sequence. The length of probe depends on the hybridization conditions and the sequences of the probe and nucleic acid target sequence and can, for example, be 10-20, 21-70, 71-100 or more bases or nucleotides in length. The probes can optionally be fixed to a solid support such as an array chip or a microarray chip. For example, the PCR product produced with the primers could be used as a probe. The PCR product can, for example, be subcloned into a vector and optionally digested and used as a probe.

The term “reverse complement” or “reverse complementary”, when referring to a polynucleotide, as used herein refers to a polynucleotide comprising a sequence that is complementary to a DNA in terms of base-pairing and which is reversed so that it is oriented in the 5′ to 3′ direction.

As used herein, the term “kit” refers to a collection of products that are used to perform a reaction, procedure, or synthesis, such as, for example, a genotyping assay etc., which are typically shipped together, usually within a common packaging, to an end user.

The term “target allele” as used herein means an allele for a SNP listed in Table 4, 5 or 8.

The term “major allele” as used herein is the allele most commonly present in a population. The major allele listed in Tables 4, 5 and 8 is the allele most commonly present in Cannabis sativa and Cannabis indica strains (Tables 4 and 8) and marijuana and hemp strains (Table 5) respectively.

The term “minor allele” as used herein is the allele least commonly present in a population (e.g. C. sativa and C. indica or marijuana and hemp). The minor allele listed in Tables 4, 5 and 8 is present in the frequency indicated therein.

A single-stranded nucleic acid molecule is “complementary” to another single-stranded nucleic acid molecule when it can base-pair (hybridize) with all or a portion of the other nucleic acid molecule to form a double helix (double-stranded nucleic acid molecule), based on the ability of guanine (G) to base pair with cytosine (C) and adenine (A) to base pair with thymine (T) or uridine (U).
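The base-pairing rule and the definition of “reverse complement” given above can be illustrated for DNA sequences as follows. This is a minimal sketch; the function name `reverse_complement` is an illustrative assumption, and only the four unambiguous DNA bases are handled.

```python
# Translation table implementing the DNA pairing rule: A<->T and G<->C.
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    unexpected = set(seq) - set("ACGTacgt")
    if unexpected:
        raise ValueError(f"unsupported characters: {sorted(unexpected)}")
    # Complement every base, then reverse so the result reads 5' to 3'.
    return seq.translate(_DNA_COMPLEMENT)[::-1]
```

For example, a SNP reported as A on one strand would be read as T at the corresponding position on the reverse strand.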

The term “hybridize” as used herein refers to the sequence specific non-covalent binding interaction with a complementary nucleic acid.

The term “selectively hybridize” as used herein refers to hybridization under moderately stringent or highly stringent physiological conditions, which can distinguish related nucleotide sequences from unrelated nucleotide sequences. In nucleic acid hybridization reactions, the conditions used to achieve a particular level of stringency are known to vary, depending on the nature of the nucleic acids being hybridized, including, for example, the length, degree of complementarity, nucleotide sequence composition (e.g., relative GC:AT content), and nucleic acid type, i.e., whether the oligonucleotide or the target nucleic acid sequence is DNA or RNA. An additional consideration is whether one of the nucleic acids is immobilized, for example, on a filter, bead, chip, or other solid matrix. Appropriate stringency conditions which promote hybridization are known to those skilled in the art, or can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6 and/or Current Protocols in Nucleic Acid Chemistry available at http://onlinelibrary.wiley.com/browse/publications?type=lab protocols.

As used in this application, the words “comprising” (and any form of comprising, such as “comprise” and “comprises”), “having” (and any form of having, such as “have” and “has”), “including” (and any form of including, such as “include” and “includes”) or “containing” (and any form of containing, such as “contain” and “contains”), are inclusive or open-ended and do not exclude additional, unrecited elements or process steps.

As used in this application and claim(s), the word “consisting” and its derivatives are intended to be close ended terms that specify the presence of stated features, elements, components, groups, integers, and/or steps, and also exclude the presence of other unstated features, elements, components, groups, integers and/or steps.

The term “consisting essentially of”, as used herein, is intended to specify the presence of the stated features, elements, components, groups, integers, and/or steps as well as those that do not materially affect the basic and novel characteristic(s) of these features, elements, components, groups, integers, and/or steps.

The terms “about”, “substantially” and “approximately” as used herein mean a reasonable amount of deviation of the modified term such that the end result is not significantly changed. These terms of degree should be construed as including a deviation of at least ±5% of the modified term if this deviation would not negate the meaning of the word it modifies.

As used in this application, the singular forms “a”, “an” and “the” include plural references unless the content clearly dictates otherwise. For example, an embodiment including “a compound” should be understood to present certain aspects with one compound or two or more additional compounds.

Further, the definitions and embodiments described in particular sections are intended to be applicable to other embodiments herein described for which they are suitable as would be understood by a person skilled in the art. For example, in the following passages, different aspects of the invention are defined in more detail. Each aspect so defined may be combined with any other aspect or aspects unless clearly indicated to the contrary. In particular, any feature indicated as being preferred or advantageous may be combined with any other feature or features indicated as being preferred or advantageous.

III. Methods and Products

The present disclosure identifies, for example, a plurality of single nucleotide polymorphism (SNP) ancestry informative markers (AIMs) that can be used to characterize cannabis samples. Cannabis samples can be characterized, for example, according to their ancestral relatedness and/or whether the sample is likely marijuana or hemp. Accordingly, the present disclosure provides methods, nucleic acids, primers and kits useful for detecting whether a sample is Cannabis sativa dominant or Cannabis indica dominant and for assessing the relatedness of a test sample to Cannabis sativa and/or Cannabis indica reference samples, as well as methods, nucleic acids, primers and kits for distinguishing marijuana from hemp. Also provided are a computer implemented method, a computer program embodied on a computer readable medium, a system, apparatus and/or processor for carrying out a method or part thereof described herein.

Embodiments of the methods and systems described herein may be implemented in hardware or software, or a combination of both. These embodiments may be implemented in computer programs executing on programmable computers, each computer including at least one processor, a data storage system (including volatile memory or non-volatile memory or other data storage elements or a combination thereof), and at least one communication interface. For example, and without limitation, the various programmable computers may be a server, network appliance, set-top box, embedded device, computer expansion module, personal computer, laptop, mobile telephone, smartphone or any other computing device capable of being configured to carry out the methods described herein.

The data storage system may comprise a database, such as on a data storage element, in order to provide a database of Cannabis reference strains, and/or reference profiles. Furthermore, computer instructions may be stored for configuring the processor to execute any of the steps and algorithms described herein as a computer program.

Each program may be implemented in a high level procedural or object oriented programming or scripting language, or both, to communicate with a computer system. However, alternatively the programs may be implemented in assembly or machine language, if desired. The language may be a compiled or interpreted language. Each such computer program may be stored on a non-transitory computer readable storage medium (e.g. read-only memory, magnetic disk, optical disc). The storage medium so configured causes a computer to operate in a specific and predefined manner to perform the functions described herein.

An aspect of the present disclosure is a method for detecting the presence or absence of each of a set of target alleles in a cannabis sample, the method comprising:

I) obtaining a test sample comprising genomic DNA, and

II) either

i) genotyping the test sample for a set of single nucleotide polymorphisms (SNPs), the set comprising at least 10, 20, 30, 40, 48, 50, 60, 70, 80, 90, 96, 100 or any number between and including 10-200 of the SNPs in Table 4 and/or 8, wherein each SNP comprises a major allele and a minor allele as provided in Table 4 and 8; and

ii) detecting for each SNP of the set the presence or absence of the major allele and/or the minor allele in the test sample;

or

a) genotyping the test sample for a set of SNPs, the set comprising at least 10, 20, 30, 40, 48, 50, 60, 70, 80, 90, 96 or 100 of the SNPs in Table 5, wherein each SNP comprises a major allele and a minor allele as provided in Table 5; and

b) detecting for each SNP of the set the presence or absence of the major allele and/or the minor allele in the test sample.
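The detecting step recited above can be sketched in software once genotype calls are available. The following is an illustrative sketch only: the SNP names, the two-letter diploid call format (e.g. "AG") and the function name `detect_alleles` are assumptions made for this example and do not come from Tables 4, 5 or 8.

```python
from typing import Dict, Tuple


def detect_alleles(
    panel: Dict[str, Tuple[str, str]],
    calls: Dict[str, str],
) -> Dict[str, Dict[str, bool]]:
    """Report, per SNP, whether the major and/or minor allele is present.

    panel: SNP name -> (major allele, minor allele), as a table would list them.
    calls: SNP name -> diploid genotype call such as "AG" (assumed format);
           a SNP missing from `calls` is treated as having no allele detected.
    """
    result = {}
    for snp, (major, minor) in panel.items():
        call = calls.get(snp, "")
        result[snp] = {
            "major_present": major in call,
            "minor_present": minor in call,
        }
    return result


# Hypothetical two-SNP panel and calls, for illustration only.
example_panel = {"snp_A": ("A", "G"), "snp_B": ("C", "T")}
example_calls = {"snp_A": "AG", "snp_B": "CC"}
example_result = detect_alleles(example_panel, example_calls)
```

In this hypothetical example, `snp_A` is heterozygous (both alleles present) and `snp_B` is homozygous for the major allele.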

In an embodiment, the SNPs in Table 4 and/or 8 can be used to determine the ancestral contribution of Cannabis sativa and/or Cannabis indica in a marijuana strain.

The step of obtaining a test sample comprising genomic DNA can be accomplished, for example, by taking the cannabis sample or an aliquot thereof, for example if the cannabis sample is isolated genomic DNA, or can comprise preparing isolated genomic DNA from the cannabis sample or a portion thereof.

The cannabis sample or the test sample (e.g. comprising at least a portion of the cannabis sample) is any cannabis sample comprising genomic DNA. The sample can be isolated genomic DNA or a portion of a plant and/or seed comprising genomic DNA and optionally from which genomic DNA can be isolated. For example, the test sample can be a plant sample, a seed sample, a leaf sample, a flower sample, a trichome sample, a pollen sample, a sample of dried plant material including leaf, flower, pollen and/or trichomes, or a sample produced through in vitro tissue or cell culture. Genomic DNA can be isolated using a number of techniques such as NaOH extraction, phenol/chloroform extraction, or DNA extraction systems such as Qiagen Direct PCR DNA Extraction System (Cedarlane, Burlington ON). In some embodiments, genomic DNA is not purified prior to genotyping. For example, with the Phire Plant Direct PCR Kit the DNA target can be used to detect SNP alleles without prior DNA extraction (Life Technologies, Burlington ON).

In an embodiment, the set of target alleles which are detected are a plurality of SNPs in Tables 4, 5 and/or 8. Tables 4 and 8 each list 100 SNPs, including a major allele and a minor allele and the minor allele frequency in Cannabis sativa strains and Cannabis indica strains. Table 5 lists 100 SNPs including a major allele and a minor allele and the minor allele frequency in marijuana and hemp. Also described in these Tables is the SNP position in the canSat3 C. sativa reference genome assembly which is described in van Bakel et al [3], identified as the SNP name. The genome build assembly is identified by the number 3 for SNPs defined by SEQ ID NOs: 1-400 (CanSat3) and the number 5 for SNPs defined by SEQ ID NOs: 401-600 (CanSat5). Tables 6, 7 and 9 also identify the upstream+SNP and downstream sequences associated with each SNP. A person skilled in the art would understand that genomic DNA is double stranded and that the complementary nucleotide on the reverse strand can also be detected based on the complementary base pairing rules.

Genotyping the cannabis sample at the loci listed, for example, in Tables 4, 5 and 8 can be accomplished by various methods and platforms.

In an embodiment, the step of genotyping comprises sequencing genomic DNA, for example using a genotyping by sequencing (GBS) method. GBS is typically a multiplexed approach involving tagging randomly sheared DNA from different samples with DNA barcodes and pooling the samples in a sequencing reaction. Target enrichment and/or reduction of genome complexity, for example using restriction enzymes, can also be employed.

In another embodiment, the step of genotyping comprises sequencing pooled amplicons, including captured amplicons. In an embodiment, the amplicons are produced using primers flanking the SNPs, for example within 100 nucleotides upstream and/or within 100 nucleotides downstream of the SNP location, and amplifying the targeted region. The resulting amplification products are then sequenced. Forward primers and reverse primers that amplify, for example, 25 or more nucleotides surrounding and including the SNP can be used in such genotyping methods.
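As an illustration of the flanking regions described above, the following sketch extracts up to 100 nucleotides on either side of a SNP from a longer sequence, which is the window within which flanking primers would be placed. It is not a primer design tool; the function name `flanking_windows` and the 0-based `snp_index` convention are assumptions made for this example.

```python
from typing import Tuple


def flanking_windows(
    sequence: str, snp_index: int, flank: int = 100
) -> Tuple[str, str, str]:
    """Return (upstream, SNP base, downstream) around a SNP position.

    sequence: reference sequence containing the SNP.
    snp_index: 0-based position of the SNP within `sequence` (assumed convention).
    flank: maximum number of nucleotides taken on each side (100 in the text).
    """
    if not 0 <= snp_index < len(sequence):
        raise IndexError("snp_index is outside the sequence")
    # Windows are truncated at the sequence ends rather than padded.
    upstream = sequence[max(0, snp_index - flank):snp_index]
    downstream = sequence[snp_index + 1:snp_index + 1 + flank]
    return upstream, sequence[snp_index], downstream
```

A forward primer would then be chosen within the upstream window and a reverse primer within the reverse complement of the downstream window, so that the amplicon spans the SNP.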

A variety of sequencing methods can be employed, including electrophoresis-based sequencing technology (e.g. chain termination methods, dye-terminator sequencing), sequencing by hybridization, mass spectrometry based sequencing, sequence-specific detection of single-stranded DNA using engineered nanopores and sequencing by ligation. For example, amplified fragments can be purified and sequenced directly or after gel electrophoresis and extraction from the gel.

Other PCR based genotyping methods can also be used, optionally comprising DNA amplification using forward and reverse primers and/or primer extension.

For example, the iPLEX Gold Assay by Sequenom® provides a SNP genotyping assay where PCR primers are designed in a region of approximately 100 base pairs around the SNP of interest and an extension primer is designed adjacent to the SNP. The method involves PCR amplification followed by the addition of shrimp alkaline phosphatase (SAP) to inactivate remaining nucleotides in the reaction. The primer extension mixture is then added and the mixture is deposited on a chip for data analysis by a MALDI-TOF mass spectrometer (Protocol Guide 2008).

In another embodiment, the genotyping method comprises using an allele specific primer. An example is the KASP™ genotyping system, a fluorescent genotyping technology which uses two different allele specific competing forward primers with unique tail sequences and one reverse primer. Each unique tail binds a unique fluorescent labelled oligo, generating a signal upon PCR amplification of the unique tail.

In an embodiment, allele specific probes are utilized. For example, an allele specific probe includes the complementary residue for the target allele of interest and under specified conditions preferentially binds the target allele. The probe can comprise a DNA or RNA polynucleotide and the genotyping step can comprise contacting the test sample with a plurality of probes, each of the probes specific for a SNP allele of the set of SNPs, under conditions suitable for detecting, for example, the minor SNP alleles.

In an embodiment, the genotyping method comprises using an array. The array can be a fixed or flexible array comprising, for example, allele specific probes. The array can be a bead array, for example as is the Infinium HD Assay by Illumina. In an embodiment, the array comprises primers and/or probes using sequences or parts thereof described in SEQ ID NOs: 1-600. The array format can comprise primers or probes for genotyping, for example, at least 10, 20, 30, 40, 48, 50, 60, 70, 80, 90, 96 or 100 or more SNPs, for example any number between and including 1 and 300, optionally 10 and 300 or 10 and 200 or 10 and 100. In an embodiment, the array format comprises one or more primers or probes for each SNP. In an embodiment, the array comprises 96 reactions.

The upstream sequence, the SNP and the downstream sequence for each of the SNPs in Tables 4, 5 and 8 are provided in Tables 6, 7 and 9.

As demonstrated in FIG. 7, a level of accuracy can be achieved using the 10 SNPs with the highest Fst values. Accordingly, in one embodiment, the set of SNPs comprises the first listed 10, 20, 30, 40, 48, 50, 60, 70, 80, 90, 96 or 100 SNPs or any number or combination of SNPs between and including 10 and 300, optionally 10 and 100, in Table 4, 5 or 8, optionally any combination of SNPs in Tables 4 and/or 8. In an embodiment, the set of SNPs comprises a plurality or all of the SNPs in Table 4 and/or 8 with a Fst of greater than 0.712 or 0.6277. In another embodiment, the set of SNPs comprises a plurality or all of the SNPs in Table 5 with a Fst of greater than 0.679. In an embodiment, the set of SNPs includes at least 2 wherein the allele frequency is 0.
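Selecting a panel either by rank (the SNPs with the highest Fst values) or by an Fst cut-off, as described above, can be sketched as follows. The SNP names and Fst values in the example are placeholders rather than entries from Tables 4, 5 or 8, and the function names are hypothetical.

```python
from typing import List, Tuple

SnpFst = Tuple[str, float]  # (SNP name, per-SNP Fst)


def top_n_by_fst(snps: List[SnpFst], n: int) -> List[str]:
    """Return the names of the n SNPs with the highest Fst, highest first."""
    ranked = sorted(snps, key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:n]]


def above_fst(snps: List[SnpFst], threshold: float) -> List[str]:
    """Return the names of SNPs whose Fst is strictly greater than threshold."""
    return [name for name, fst in snps if fst > threshold]


# Placeholder values for illustration only.
example_snps = [("snp1", 0.80), ("snp2", 0.65), ("snp3", 0.72), ("snp4", 0.70)]
top_two = top_n_by_fst(example_snps, 2)        # ['snp1', 'snp3']
high_fst = above_fst(example_snps, 0.712)      # ['snp1', 'snp3']
```

If the tables are already ordered by descending Fst, taking the first listed N SNPs is equivalent to the rank-based selection sketched here.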

In an embodiment, any number of SNPs listed in Tables 4 and/or 8, or Table 5 is genotyped.

In an embodiment, a plurality of SNPs listed in Tables 4 and/or 8 and 5 are detected. In such methods, both ancestry contribution and marijuana versus hemp assessments can be conducted in one assay.

In an embodiment, the step of detecting the SNP comprises receiving, reviewing and/or extracting from a file, document, reaction, array or database, the genotype for each of the SNPs of the set.

In certain embodiments, the method further comprises displaying and/or providing a document displaying one or more features of the major and/or minor alleles. For example, the one or more features can comprise the position of the SNP, the nucleotide identity of the SNP or the nucleotide identity if a minor allele is detected, the number of reads or reactions, the number of minor alleles, confidence intervals etc. The document can be an electronic document that is provided to a third party. In an embodiment, the one or more features displayed is selected from the allele nucleotide identity and the number of minor alleles in common with Cannabis sativa, Cannabis indica, marijuana or hemp.

As demonstrated herein, the SNP allele information can be used to characterize the cannabis sample. Accordingly, in an embodiment, the method further comprises determining ancestry contribution of the test sample.

The ancestry contribution is optionally an ancestry contribution estimate or identification of ancestry dominance. For example, the ancestry dominance of the test sample can be Cannabis sativa dominant or Cannabis indica dominant according to the set of target alleles detected in step II) ii). If the target alleles in combination when compared to a database of cannabis reference strains and/or the reference profiles provided in Table 4 and 8 are most similar to alleles more commonly found in Cannabis sativa, for example if greater than 50% of the cannabis sample's SNPs are alleles more commonly present in Cannabis sativa, the cannabis sample is identified as Cannabis sativa dominant. Conversely, if the target alleles in combination are most similar to alleles more commonly found in Cannabis indica, for example if greater than 50% of the cannabis sample's SNPs are alleles more commonly present in Cannabis indica, the cannabis sample is identified as Cannabis indica dominant.
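As a non-limiting illustration, the "greater than 50%" dominance rule can be sketched in Python as below. The function name, the SNP identifiers and the one-allele-per-SNP simplification are assumptions made for this sketch only; the allele pairs shown are hypothetical and are not taken from Table 4 or 8.

```python
from typing import Dict, Optional, Tuple


def call_dominance(
    detected: Dict[str, Optional[str]],
    reference: Dict[str, Tuple[str, str]],
) -> str:
    """Return an ancestry dominance call by simple majority.

    detected  -- SNP id -> allele observed in the test sample (None if unknown)
    reference -- SNP id -> (allele more common in C. sativa,
                            allele more common in C. indica)
    """
    sativa_like = indica_like = 0
    for snp, allele in detected.items():
        if allele is None or snp not in reference:
            continue  # failed reaction or SNP outside the panel
        sativa_allele, indica_allele = reference[snp]
        if allele == sativa_allele:
            sativa_like += 1
        elif allele == indica_allele:
            indica_like += 1
        # any other nucleotide is left out of the count
    informative = sativa_like + indica_like
    if informative == 0:
        return "undetermined"
    if sativa_like / informative > 0.5:
        return "Cannabis sativa dominant"
    if indica_like / informative > 0.5:
        return "Cannabis indica dominant"
    return "undetermined"  # exact tie


# Hypothetical three-SNP panel and test sample
reference = {"snp1": ("A", "C"), "snp2": ("G", "T"), "snp3": ("C", "T")}
detected = {"snp1": "A", "snp2": "G", "snp3": "T", "snp4": None}
print(call_dominance(detected, reference))
```

A diploid implementation would count two alleles per SNP; the majority rule itself would be unchanged.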

An ancestry contribution estimate is calculated in one embodiment, according to a method described in the Examples. Other calculations for determining admixture can also be applied as further described herein.

Other nucleotides may be detected at the SNP positions described or a particular reaction may fail. In an embodiment, if an allele other than an allele reported in Tables 4, 5 and 8 is detected or if the nucleotide at the position is unknown, the allele is not considered in the methods described.

An ancestry contribution estimate can identify a population structure that is associated or is most likely given the nucleotide occurrences of the SNPs in the cannabis sample.

In an embodiment, the method further comprises identifying the test sample as marijuana or hemp, according to the set of target alleles detected in step II) b). For example, if the target alleles in combination when compared to a database of cannabis reference strains and/or the reference profiles provided in Table 5 are most similar to alleles more commonly found in hemp, the cannabis sample is identified as hemp. Conversely, if the target alleles in combination are most similar to alleles more commonly found in marijuana, the cannabis sample is identified as marijuana.

An aspect accordingly includes a method of determining ancestry contribution of a cannabis sample, optionally to determine if a sample comprises Cannabis sativa and/or Cannabis indica, the method comprising:

I) obtaining a test sample comprising genomic DNA,

II) i) genotyping the test sample for a set of single nucleotide polymorphisms (SNPs), the set comprising at least 10, 20, 30, 40, 48, 50, 60, 70, 80, 90, 96 or 100 or more of the SNPs in Table 4 and/or 8, wherein each SNP comprises a major allele and a minor allele as provided in Table 4 and 8; and

ii) detecting for each SNP of the set the presence or absence of the major allele and/or the minor allele in the test sample; and

III) determining ancestry contribution of the test sample according to the set of target alleles detected in step II) ii) and providing an estimate of the ancestry contribution or identifying the test sample as Cannabis sativa dominant or Cannabis indica dominant.

As mentioned above, dominance is assigned as Cannabis sativa dominant or Cannabis indica dominant according to the similarity of the detected alleles. If the set of detected alleles, when compared to a database of cannabis reference strains and/or the reference profiles provided in Table 4 and/or 8, is most similar to alleles more commonly found in Cannabis sativa as indicated in Table 4 and 8, the cannabis sample is assigned as Cannabis sativa dominant. Similarly, if the set of detected alleles is most similar to alleles more commonly found in Cannabis indica as indicated in Table 4 and 8, the cannabis sample is assigned as Cannabis indica dominant.

In an embodiment, the method further comprises selecting a breeding partner.

The ancestry estimates can be used for example to identify Sativa- or Indica-type breeding individuals when classification is unknown or unsure. As an example, the SNPs described herein can be used to breed an offspring with a desired or defined contribution, for example about equal contribution, of Cannabis indica and Cannabis sativa genetic material. The SNPs in Table 4, 5 and 8 can be used to select for marijuana and hemp, or Indica- and Sativa-type strains with the desired ancestry contribution for use as parents.

For example, these markers can be used in marker-assisted selection (MAS) to breed cannabis plants that contain defined levels of Indica-type or Sativa-type ancestry.

As another example, SNPs as described herein can be used in ancestry selection breeding and used to speed the recovery of the cultivated genetic background (as described in [22]). For example, in a cross between a cultivated line and a wild line, the F1 offspring generated from such a cross necessarily derive 50% of their ancestry from each parent. On backcrossing to the cultivated line, each offspring will differ in the proportion of its ancestry from the wild and cultivated sources. Genetic markers distributed across the genome can be used to provide an estimate of the ancestry proportions, and the breeder can then select the offspring with the highest proportion of cultivated ancestry. Such methods can for example be performed with marker assisted selection (which uses trait associated markers), to select a small number of offspring in each generation that carry both the desired trait from the wild and the most cultivated ancestry.

In an embodiment, the method is for assessing if the cannabis sample is marijuana. For example, the marijuana can be for medical use.

Also provided is a set of SNPs that can be used to determine if a sample comprises hemp or marijuana. Accordingly another aspect includes a method for determining if a sample likely comprises hemp and/or marijuana, the method comprising:

I) obtaining a test sample comprising genomic DNA,

II) a) genotyping the test sample for a set of single nucleotide polymorphisms (SNPs), the set comprising at least 10, 20, 30, 40, 48, 50, 60, 70, 80, 90, 96 or 100 of the SNPs in Table 5, wherein each SNP comprises a major allele and a minor allele as provided in Table 5; and

b) detecting for each SNP of the set the presence or absence of the major allele and/or the minor allele in the test sample; and

III) identifying whether the sample likely comprises hemp or marijuana according to the set of target alleles detected in step II) b).

In an embodiment, the method is for differentiating medicinal/drug/pharmaceutical and non-medicinal/non-drug/non-pharmaceutical cannabis.

The identifying step comprises for example comparing to a database of reference alleles and/or comparing to the reference profiles in Table 5. The comparing step is further described below.

A further aspect includes a method for measuring genetic relatedness of a cannabis sample to a Cannabis sativa reference and/or a Cannabis indica reference, the method comprising:

I) obtaining a test sample comprising genomic DNA,

II) i) genotyping the test sample for a set of single nucleotide polymorphisms (SNPs), the set comprising at least 10, 20, 30, 40, 48, 50, 60, 70, 80, 90, 96, 100 or any number between 10 and 200 of the SNPs in Table 4 and/or 8, wherein each SNP comprises a major allele and a minor allele as provided in Table 4 and 8; and

ii) detecting for each SNP of the set the presence or absence of the major allele and/or the minor allele in the test sample;

III) comparing the test sample SNP to the Cannabis sativa reference and/or Cannabis indica reference according to the set of target alleles detected in step II); and

IV) displaying and/or providing a document displaying the calculated genetic relatedness of the test sample.

In an embodiment, the detecting, identifying and/or comparing step comprises calculating the genetic relatedness of the test sample to the cannabis reference, optionally a Cannabis sativa reference and/or Cannabis indica reference according to the set of target alleles detected in step II). The comparing step in an embodiment is carried out using a computer, for example a computer comprising a database for storing reference profiles for one or more strains or for the particular Cannabis sativa reference and/or Cannabis indica reference.

In an embodiment, the Cannabis sativa reference and/or the Cannabis indica reference is a reference profile or plurality of reference profiles stored in a database. The reference profile can for example include the SNP allele identities (e.g. minor allele) in Table 4 and/or 8 and its frequency for the species (e.g. a master reference profile) or the SNP allele identities of a particular strain.

In some embodiments, the reference is a reference sample and the method can comprise genotyping one or more reference samples and the test sample and comparing the detected alleles to identify the number of matches.

A further aspect includes a method for measuring a genetic relatedness of a Cannabis sativa sample to a reference marijuana or reference hemp sample, the method comprising:

I) obtaining a test sample comprising genomic DNA,

II) a) genotyping the test sample for a set of single nucleotide polymorphisms (SNPs), the set comprising at least 10, 20, 30, 40, 48, 50, 60, 70, 80, 90, 96 or 100 of the SNPs in Table 5, wherein each SNP comprises a major allele and a minor allele as provided in Table 5; and

b) detecting for each SNP of the set the presence or absence of the major allele and/or the minor allele in the test sample;

III) calculating the genetic relatedness of the test sample to the marijuana reference and/or the hemp reference according to the set of target alleles detected in step II) b); and

IV) displaying and/or providing a document displaying the calculated genetic relatedness of the test sample.

The method of determining ancestry contribution and/or the comparison for identifying the sample can involve use of a specifically programmed computer using, for example, an algorithm to 1) compare the identity of the allele, e.g. whether the major and/or minor allele is detected, for each of the set of SNPs genotyped in the test sample to one or more cannabis references, optionally compared to a database comprising a cannabis reference profile such as a master cannabis profile or a plurality of reference profiles, wherein each cannabis reference profile comprises genotype information for the set of SNPs detected; and 2) assign or calculate the ancestry contribution of the cannabis sample. Any algorithm for admixture analyses can be used. Computer implemented clustering and assignment protocols can also be used. The comparing step can also comprise comparing the relative frequency differences.

For example as demonstrated herein, the algorithm can direct a principal components analysis or a fastSTRUCTURE analysis. For example, as demonstrated herein, principal component axes can be established using a plurality of cannabis reference strains and/or reference profiles. A cannabis sample genotype can be projected onto the two PCs. The ancestry contribution of Cannabis sativa for example can then be calculated using the formula: % Cannabis sativa = b/(a+b),

wherein a and b are the chord distances along the first principal component from the centroids of the Cannabis sativa strains and the Cannabis indica strains respectively.
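By way of a minimal sketch, the ratio b/(a+b) can be computed from PC1 coordinates as follows. The function and argument names are hypothetical, and the sketch assumes that the sample lies between the two centroids on PC1 and that the result is reported as a percentage.

```python
def percent_sativa(sample_pc1: float,
                   sativa_centroid_pc1: float,
                   indica_centroid_pc1: float) -> float:
    """Estimate % Cannabis sativa ancestry from positions along PC1.

    a -- distance from the sample to the C. sativa centroid
    b -- distance from the sample to the C. indica centroid
    """
    a = abs(sample_pc1 - sativa_centroid_pc1)
    b = abs(sample_pc1 - indica_centroid_pc1)
    if a + b == 0:
        raise ValueError("centroids coincide with the sample; ratio undefined")
    return 100.0 * b / (a + b)


# Hypothetical PC1 coordinates: centroids at -1 and +1, sample at 0.5
print(percent_sativa(0.5, -1.0, 1.0))
```

A sample sitting on the Cannabis sativa centroid has a = 0 and is therefore scored as 100% Cannabis sativa; a sample midway between the centroids is scored as 50%.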

In an embodiment, the algorithm is an algorithm described in the Examples.

Both the major allele and the minor allele can be detected in a test sample which can be used in determining the ancestry and/or assessing marijuana and/or hemp relatedness.

Also described herein are isolated nucleic acids, for example as primers or probes to detect the SNPs described herein. Accordingly another aspect includes an isolated nucleic acid comprising at least 9, 12, 15 or at least 18 contiguous nucleotides of any one of SEQ ID Nos 1-600 or the complement thereof.

In an embodiment, the isolated nucleic acid is a probe and comprises at least 12 or at least 18 nucleotides of contiguous sequence including the minor or major allele nucleotide; optionally including upstream sequence and/or downstream sequence contiguous with the minor or major allele.

In an embodiment, the nucleic acid is a primer comprising an isolated nucleic acid described herein.

In an embodiment, the primer is a forward PCR primer that hybridizes with a contiguous set of residues within 1-100 of any one of odd numbered SEQ ID Nos 1-600 or the complement or reverse complement of residues 1-100 of any one of odd numbered SEQ ID Nos 1-600. In another embodiment, the primer is a reverse PCR primer (downstream primer) that hybridizes with residues 1 to 100 of any one of even numbered SEQ ID Nos 1-600 or the complement or the reverse complement of residues 1 to 100 of any one of even numbered SEQ ID Nos 1-600.

In another embodiment, the primer is an allele specific primer for a major allele and/or a minor allele in Table 4, 5 or 8 and binds to residue 101 of any one of odd numbered SEQ ID Nos 1-600. The odd numbered SEQ ID NOs comprise upstream sequence (for example 10 or more nucleotides) and the SNP allele at position 101 (e.g. 90-101). The even numbered SEQ ID NOs provide downstream sequence as indicated in Tables 6, 7 and 9. For example SEQ ID NO:1 provides upstream sequence for SNP scaffold14566:24841 at nucleotides 1-100 and the SNP at nucleotide 101. SEQ ID NO:2 provides downstream sequence for this SNP.

In another embodiment, the primer is a primer extension primer and binds to residue 101 of any one of odd numbered SEQ ID Nos 1-600.

Another aspect includes a plurality of primers for detecting a SNP allele in Table 4, 5 and/or 8, wherein the plurality comprises at least 2 different primers selected from primers described herein.

In an embodiment, the plurality is a plurality of primer pairs.

A further aspect is a probe that is specific for an allele.

In yet another embodiment, the primer or probe further comprises a covalently bound tag, optionally a sequence specific nucleotide tail or label. The primer or probe nucleotide sequence tag can comprise or can be coupled to a fluorescent, radioactive, metal or other detectable label.

The primer or probe can also comprise a linker.

Yet a further aspect includes an array, optionally a species specific array comprising a plurality of nucleic acid probes attached to a support surface, each isolated nucleic acid probe comprising a sequence of about 9 to about 100 nucleotides, for example about 9 to about 50 nucleotides or about 18 to about 30 nucleotides, wherein the sequence is at least 9, 12, 15 or at least 18 contiguous nucleotides of any one of SEQ ID NOs: 1-600.

The probe can comprise a sequence that is just upstream of the SNP nucleotide, for example nucleotides 83-100 of any odd numbered SEQ ID NO: 1-600. In an embodiment, the array comprises allele specific probes (nucleic acids optionally labeled), for example wherein the probe comprises upstream sequence and the SNP.

In an embodiment, the array further comprises one or more negative control probes and/or one or more positive control probes.

A further aspect includes a kit comprising an isolated nucleic acid, primer, or plurality of primers and/or array described herein.

The kit can comprise various other reagents for amplifying DNA and/or using an array to detect a SNP such as dNTPs, polymerase, reaction buffer, wash buffers and the like. Accordingly in an embodiment, the kit comprises at least one reagent for a DNA amplification reaction.

In an embodiment, the kit further comprises at least one reagent for a primer extension reaction.

In an embodiment, the set for any of the methods, sets, pluralities, kits, nucleic acids or arrays comprises at least 10, 20, 30 or 40 of the SNPs in Table 4, 5 and/or 8.

The above disclosure generally describes the present application. A more complete understanding can be obtained by reference to the following specific examples. These examples are described solely for the purpose of illustration and are not intended to limit the scope of the application. Changes in form and substitution of equivalents are contemplated as circumstances might suggest or render expedient. Although specific terms have been employed herein, such terms are intended in a descriptive sense and not for purposes of limitation.

The following non-limiting examples are illustrative of the present disclosure:

EXAMPLES

Example 1

Despite its cultivation as a source of food, fibre and medicine, and its global status as the most used illicit drug, the genus Cannabis has an inconclusive taxonomic organization and evolutionary history. Drug types of Cannabis (marijuana), which contain high amounts of the psychoactive cannabinoid delta-9 tetrahydrocannabinol (THC), are used for medicinal purposes and as a recreational drug. Hemp types are grown for the production of seed and fibre, and contain low amounts of THC. Two species or gene pools (C. sativa and C. indica) are widely used in describing the pedigree or appearance of cultivated cannabis plants. Using 14,031 single-nucleotide polymorphisms (SNPs) genotyped in 81 marijuana and 43 hemp samples, marijuana and hemp are found to be significantly differentiated at a genome-wide level, demonstrating that the distinction between these populations is not limited to genes underlying THC production. There is a moderate correlation between the genetic structure of marijuana strains and their reported C. sativa and C. indica ancestry.

To evaluate the genetic structure of commonly cultivated Cannabis, 81 marijuana and 43 hemp samples were genotyped using genotyping-by-sequencing (GBS) [5]. The marijuana samples represent a broad cross section of modern commercial strains and landraces, while the hemp samples include diverse European and Asian accessions and modern varieties. In total, 14,031 SNPs were identified after applying quality and missingness filters. Principal components analysis (PCA) of both marijuana and hemp (FIG. 1a) revealed clear genetic structure separating marijuana and hemp along the first principal component (PC1). This distinction was further supported using the fastSTRUCTURE algorithm [6] assuming K=2 ancestral populations (FIG. 1c). PCA and fastSTRUCTURE produced highly similar results: a sample's position along PC1 was strongly correlated with its group membership according to fastSTRUCTURE at K=2 (r²=0.964; p-value=3.55×10⁻⁹⁰).

A putative C. indica marijuana strain from Pakistan that is genetically more similar to hemp than it is to other marijuana strains was identified (FIG. 1a). Similarly, hemp sample CAN 37/97 clusters more closely with marijuana strains (FIG. 1a). These outliers may be due to sample mix-up or their classification as hemp or marijuana may be incorrect.

These results significantly expand our understanding of the evolution of marijuana and hemp lineages in Cannabis. Previous analyses have shown that marijuana and hemp differ in their capacity for cannabinoid biosynthesis, with marijuana possessing the B_(T) allele coding for tetrahydrocannabinolic acid synthase and hemp typically possessing the B_(D) allele for cannabidiolic acid synthase [7]. As well, transcriptome analysis of female flowers showed that cannabinoid pathway genes are significantly upregulated in marijuana compared to hemp, as expected from the very high THC levels in the former compared to the latter [3]. The present results indicate that the genetic differences between the two are distributed across the genome and are not restricted to loci involved in cannabinoid production. In addition, levels of heterozygosity are higher in hemp than in marijuana (FIG. 1b, Mann-Whitney U-test, p-value=8.64×10⁻¹⁴), which suggests that hemp cultivars are derived from a broader genetic base than that of marijuana strains and/or that breeding among close relatives is more common in marijuana than in hemp.

The difference between marijuana and hemp plants has considerable legal implications in many countries, and to date forensic applications have largely focused on determining whether a plant should be classified as drug or non-drug [8]. EU and Canadian regulations only permit hemp cultivars containing less than 0.3% THC to be grown. While hemp and marijuana appear relatively well separated along PC1 (FIG. 1a), no SNPs with fixed differences were found between these two groups: the highest FST value between hemp and marijuana among all 14,031 SNPs was 0.87 for a SNP with an allele frequency of 0.82 in hemp and 0 in marijuana (Table 1).

The average FST between hemp and marijuana is 0.156 (FIG. 3), which is similar to the degree of genetic differentiation in humans between Europeans and East Asians [9]. Thus, while cannabis breeding has resulted in a clear genetic differentiation according to use, hemp and marijuana still largely share a common pool of genetic variation.

Although the taxonomic separation of the putative taxa C. sativa and C. indica remains controversial, a vernacular taxonomy that distinguishes between “Sativa” and “Indica” strains is widespread in the marijuana community. Sativa-type plants, tall with narrow leaves, are widely believed to produce marijuana with a stimulating, cerebral psychoactive effect while Indica-type plants, short with wide leaves, are reported to produce marijuana that is sedative and relaxing. The genetic structure of marijuana is in partial agreement with strain-specific ancestry estimates obtained from various online sources (FIG. 2, Table 2). A moderate correlation between the positions of marijuana strains along the first principal component (PC1) of FIG. 2a and reported estimates of C. sativa ancestry (FIG. 2c) (r²=0.22; p-value=9×10⁻⁶) was observed. This relationship is also observed for the second principal component (PC2) of FIG. 1a (r²=0.23; p-value=6.71×10⁻⁶). This observation suggests that C. sativa and C. indica may represent distinguishable pools of genetic diversity [1] but that breeding has resulted in considerable admixture between the two. While there appears to be a genetic basis for the reported ancestry of many marijuana strains, in some cases the assignment of ancestry strongly disagrees with our genotype data. For example Jamaican Lambs Bread (100% reported C. sativa) was nearly identical (IBS=0.98) to a reported 100% C. indica strain from Afghanistan. Sample mix-up cannot be excluded as a potential reason for these discrepancies, but a similar level of misclassification was found in strains obtained from Dutch coffee shops based on chemical composition [10]. The inaccuracy of reported ancestry in marijuana likely stems from the predominantly clandestine nature of Cannabis growing and breeding over the past century. Recognizing this, marijuana strains sold for medical use are often referred to as Sativa or Indica “dominant” to describe their morphological characteristics and therapeutic effects [10]. The results suggest that the reported ancestry of some of the most common marijuana strains only partially captures their true ancestry.

Materials and Methods

Genetic material and genotyping. The marijuana strains genotyped were grown by Health Canada authorized producers and represent germplasm grown and used for breeding in the medical and recreational marijuana industries (Table 2). Hemp strains were obtained from a Health Canada hemp cultivation licensee, and represent modern seed and fibre cultivars grown in Canada as well as diverse European and Asian germplasm (Table 3). DNA was extracted from leaf tissue using standard protocols, and library preparation and sequencing were performed using the GBS protocol published by Sonah et al. [15]. SNPs were called using the GBS pipeline developed by Gardner et al. [16], aligning to the canSat3 C. sativa reference genome assembly [3]. Quality filtering of genetic markers was performed in PLINK [17] by removing SNPs with (i) greater than 20% missingness by locus, (ii) a minor allele frequency less than 1% and (iii) excess heterozygosity (a Hardy-Weinberg equilibrium p-value less than 0.0001). After filtering, 14,031 SNPs remained for analysis.
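For illustration only, the three per-SNP filters (missingness, minor allele frequency and Hardy-Weinberg equilibrium) can be sketched in plain Python as below. This is not the PLINK implementation: the function names are hypothetical, genotypes are assumed to be coded as 0, 1 or 2 copies of the non-reference allele with None for a missing call, and the Hardy-Weinberg p-value is taken from a two-sided chi-square approximation with one degree of freedom rather than from an exact test.

```python
import math
from typing import List, Optional


def hwe_p_value(n_hom_ref: int, n_het: int, n_hom_alt: int) -> float:
    """Chi-square (1 d.f.) approximation to a Hardy-Weinberg test."""
    n = n_hom_ref + n_het + n_hom_alt
    p = (2 * n_hom_ref + n_het) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_hom_ref, n_het, n_hom_alt)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with one degree of freedom
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_filters(genotypes: List[Optional[int]],
                   max_missing: float = 0.20,
                   min_maf: float = 0.01,
                   min_hwe_p: float = 1e-4) -> bool:
    """Return True if one SNP survives all three quality filters."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return False
    if 1.0 - len(called) / len(genotypes) > max_missing:
        return False  # (i) too many missing calls at this locus
    alt_freq = sum(called) / (2 * len(called))
    if min(alt_freq, 1.0 - alt_freq) < min_maf:
        return False  # (ii) minor allele too rare
    counts = [called.count(k) for k in (0, 1, 2)]
    return hwe_p_value(*counts) >= min_hwe_p  # (iii) HWE departure


# Hypothetical SNP called in 20 samples, in exact HWE proportions
print(passes_filters([0] * 5 + [1] * 10 + [2] * 5))
```

The default thresholds mirror the values stated in the text (20%, 1% and 0.0001).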

Collection of reported marijuana ancestry. Reported ancestry proportions (% C. sativa and % C. indica) were manually obtained from online strain databases, cannabis seed retailers, and licensed producers of medical marijuana (Table 2). Ancestry estimates for 26 strains for which no online information was available were assigned.

Analysis of population structure and heterozygosity. Principal components analysis (PCA) was performed using the adegenet v1.4-2 package [18] in R v3.1.1 using default parameters. fastSTRUCTURE [6] was run at K=2 and K=3 using default parameters for hemp and marijuana samples combined (14,031 SNPs) (FIG. 1a,c), and marijuana samples alone (10,651 SNPs) (FIG. 2a,b). Heterozygosity by individual was calculated in R by dividing the number of heterozygous sites by the number of non-missing genotypes for each sample.
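The per-individual heterozygosity calculation can be illustrated with the short Python sketch below (the analysis described above was carried out in R). The function name and the 0/1/2 genotype coding, with None for a missing call, are assumptions of the sketch.

```python
from typing import List, Optional


def individual_heterozygosity(genotypes: List[Optional[int]]) -> float:
    """Heterozygous sites divided by non-missing genotypes for one sample.

    Genotypes are coded 0 or 2 (homozygous), 1 (heterozygous), None (missing).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("sample has no non-missing genotypes")
    heterozygous = sum(1 for g in called if g == 1)
    return heterozygous / len(called)


# Hypothetical sample: two heterozygous sites among four called genotypes
print(individual_heterozygosity([0, 1, 2, None, 1]))
```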

Identity by state (IBS) Analysis. Pairwise proportion IBS between all pairs of samples was calculated using PLINK. One outlier was excluded from this analysis, C. indica (Pakistan), because of its significantly higher IBS to hemp than all other marijuana strains (Labeled marijuana sample in FIG. 1a).

To determine if the hemp population shared greater allelic similarity to C. sativa or C. indica marijuana, the mean pairwise IBS was calculated between each marijuana strain and all hemp strains. This analysis was performed at various minor allele frequency thresholds and the result remained unchanged.
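As a hedged sketch, pairwise proportion IBS and the mean IBS of one marijuana strain to a group of hemp strains can be computed as follows. The definition used here (alleles shared per locus, averaged over loci called in both samples) is one common definition and is assumed rather than taken from the text; the function names and the 0/1/2 genotype coding are likewise hypothetical.

```python
from typing import Optional, Sequence

Genotype = Sequence[Optional[int]]  # 0/1/2 allele counts, None if missing


def proportion_ibs(a: Genotype, b: Genotype) -> float:
    """Proportion of alleles identical by state between two samples."""
    shared = compared = 0
    for ga, gb in zip(a, b):
        if ga is None or gb is None:
            continue  # only loci called in both samples are compared
        shared += 2 - abs(ga - gb)  # 2, 1 or 0 alleles shared at this locus
        compared += 2
    if compared == 0:
        raise ValueError("no loci called in both samples")
    return shared / compared


def mean_ibs_to_group(sample: Genotype, group: Sequence[Genotype]) -> float:
    """Mean pairwise IBS between one sample and every member of a group."""
    return sum(proportion_ibs(sample, other) for other in group) / len(group)


# Hypothetical genotypes for one marijuana strain and two hemp strains
marijuana = [0, 1, 2, None]
hemp = [[0, 2, 0, 1], [0, 1, 2, 2]]
print(mean_ibs_to_group(marijuana, hemp))
```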

Example 2

Selection of Cannabis Informative Markers

Nine reported C. indica and 9 reported C. sativa individuals were selected to form ancestral populations for the selection of genetic markers that are able to differentiate the two groups. Individuals were selected manually on the basis of both their position along the first principal component in FIG. 5 (actual genetic structure observed using 9776 SNPs), as well as their reported C. sativa or C. indica ancestry.

Selection of Ancestry Informative Markers (AIMs)

The top 100 highest F_(ST) SNPs were extracted and evaluated for their use in estimating genetic structure, which in the present case is being used as a proxy for C. sativa/C. indica ancestry given the unavailability of true pure C. sativa and C. indica populations. The same was performed between hemp cultivars and marijuana strains of Cannabis (FIG. 1a).

Example Evaluation of AIMs for Estimating Population Structure

Assuming the first principal component of FIG. 5 is representative of population structure between C. indica and C. sativa type marijuana strains, a strain's position along the X axis (PC1) represents genetic similarity to each population. In the case of admixed individuals, the position could be representative of genomic contribution from the C. indica and C. sativa gene pools. For the purposes of this analysis, an individual's position along PC1 using 9776 SNPs (FIG. 5), is considered to be an individual's true ancestry. By projecting samples on to principal components computed using only the ancestral populations, additional samples can be added to the analysis without changing the relative positions of our ancestral strains in PC space and the centroids of the clusters can be used as anchors along PC1 for estimating ancestry. Because not every SNP will contribute equally to an individual's position along PC1, a subset of markers that will capture nearly all the variance accounted for by that component is selected.

First, the 2 highest F_(ST) SNPs are selected, and used to perform PCA using only the ancestral C. sativa and C. indica populations. The rest of the samples (n=63) are then projected onto those components, and their positions along PC1 stored. To determine the accuracy of this 2 marker panel, the Pearson's product moment correlation coefficient (Y axis, FIG. 6) was calculated between these positions and the positions calculated using the full set of 9776 SNPs. The next highest F_(ST) SNP was added to the panel and this process was repeated for all 100 highest F_(ST) markers. Accuracy is not improved within this dataset for marker panels of more than approximately 40 of the highest F_(ST) SNPs within this population (FIG. 7). Additional ancestry informative SNPs may provide greater accuracy in novel samples and can provide redundancy in the event of failed genotyping reactions.
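The incremental panel evaluation can be sketched as follows. This is an illustrative reimplementation in plain Python, not the R workflow used in the Examples: all function names are hypothetical, the first principal component is approximated by power iteration rather than by a PCA library, genotypes are assumed to be complete 0/1/2 allele counts, and the absolute value of the correlation is reported because the sign of a principal component is arbitrary.

```python
import math
from typing import List, Sequence, Tuple

Matrix = List[List[float]]  # rows are samples, columns are SNPs


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's product moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def project_on_pc1(ancestral: Matrix, others: Matrix,
                   iterations: int = 200) -> List[float]:
    """Fit PC1 on the ancestral samples only, then project other samples."""
    m = len(ancestral[0])
    means = [sum(row[j] for row in ancestral) / len(ancestral) for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in ancestral]
    axis = [float(j + 1) for j in range(m)]  # arbitrary non-zero start vector
    for _ in range(iterations):  # power iteration on the covariance structure
        scores = [sum(c * a for c, a in zip(row, axis)) for row in centered]
        new_axis = [sum(s * row[j] for s, row in zip(scores, centered))
                    for j in range(m)]
        norm = math.sqrt(sum(v * v for v in new_axis))
        if norm == 0.0:
            break
        axis = [v / norm for v in new_axis]
    return [sum((row[j] - means[j]) * axis[j] for j in range(m))
            for row in others]


def panel_accuracy(ranked_snps: Sequence[int], ancestral: Matrix,
                   others: Matrix,
                   full_positions: Sequence[float]) -> List[Tuple[int, float]]:
    """Correlation with full-set PC1 positions for panels of growing size."""
    results = []
    for size in range(2, len(ranked_snps) + 1):
        cols = ranked_snps[:size]  # the `size` highest-ranked SNP columns
        anc = [[row[c] for c in cols] for row in ancestral]
        oth = [[row[c] for c in cols] for row in others]
        r = pearson(project_on_pc1(anc, oth), full_positions)
        results.append((size, abs(r)))
    return results


# Hypothetical data: 4 ancestral samples, 4 other samples, 4 ranked SNPs
ancestral = [[0, 0, 0, 0], [0, 0, 0, 1], [2, 2, 2, 2], [2, 2, 2, 1]]
others = [[0, 0, 1, 0], [1, 1, 1, 1], [2, 2, 1, 2], [2, 1, 2, 2]]
full_positions = [-3.0, 0.0, 3.0, 3.0]  # stand-in for full-set PC1 positions
accuracy = panel_accuracy([0, 1, 2, 3], ancestral, others, full_positions)
```

Plotting the second element of each tuple against panel size would give a curve of the kind described for the marker panels above.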

Example 3

Weighting of SNPs:

To rank SNPs according to their ancestry informativeness, the fixation index (F_(ST)) according to Weir and Cockerham (1984) was calculated for each marker. This estimate ranges between 0 and 1, where a SNP with F_(ST)=1 has an allele found at 100% frequency in one population, and 0% frequency in another.

Willing, Dreyer, and van Oosterhout (2012) [21] describe the calculation as follows:

“At a single locus k, F_(ST) ^(W&C) is defined as

$\hat{F}_{ST}^{[k]} = \frac{\hat{N}^{[k]}}{\hat{D}^{[k]}}$

where

$\hat{N}^{[k]} = s^{2} - \frac{1}{2n - 1}\left[ \bar{p}\left( 1 - \bar{p} \right) - \frac{r - 1}{r}s^{2} - \frac{\bar{h}}{4} \right]$

$\hat{D}^{[k]} = \bar{p}\left( 1 - \bar{p} \right) + \frac{s^{2}}{r}$

Here, $s^{2}$ is the observed variance of allele frequencies, $n$ is the number of individuals per population, $\bar{p}$ is the mean allele frequency over all populations, $r$ is the number of sampled populations and $\bar{h}$ is the mean observed heterozygosity.”
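A minimal Python sketch of this single-locus estimator is given below. The function name is hypothetical, equal sample sizes per population are assumed, and the "observed variance" is taken to be the sample variance of the population allele frequencies with an r − 1 denominator, which is an assumption of the sketch.

```python
from typing import Sequence


def single_locus_fst(freqs: Sequence[float],
                     heterozygosity: Sequence[float],
                     n: int) -> float:
    """Single-locus F_ST following the quoted definition.

    freqs          -- allele frequency in each sampled population
    heterozygosity -- observed heterozygosity in each sampled population
    n              -- number of individuals per population (assumed equal)
    """
    r = len(freqs)
    p_bar = sum(freqs) / r
    h_bar = sum(heterozygosity) / r
    s2 = sum((p - p_bar) ** 2 for p in freqs) / (r - 1)
    numerator = s2 - (p_bar * (1 - p_bar) - (r - 1) / r * s2 - h_bar / 4) / (2 * n - 1)
    denominator = p_bar * (1 - p_bar) + s2 / r
    if denominator == 0:
        raise ValueError("locus is monomorphic in all populations")
    return numerator / denominator


# Hypothetical fixed difference: allele at 100% in one population, 0% in the other
print(single_locus_fst([1.0, 0.0], [0.0, 0.0], n=10))
```

With a fixed difference between two populations the numerator and denominator coincide, so the sketch returns 1, in line with the interpretation of F_(ST)=1 given above.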

Example 4

Population Assignment

Population assignment can be performed if the novel sample has been genotyped for ancestry informative markers for which the alleles and allele frequencies are already known in the ancestral populations.

A test sample of a cannabis sample to be characterized is obtained. The test sample is genomic DNA and the genomic DNA is subjected to genotyping of at least 10 of the markers in Table 4 or Table 5 depending on whether it is desired that the ancestral contribution be determined or the sample be identified as marijuana or hemp.

An assignment test developed by Paetkau et al. [23] and described in Hansen, Kenchington and Nielsen (2001) [24] can be used.

For each cannabis sample being assigned, the log-likelihood of it being derived from a specific population is calculated as:

$\log\left( \prod\limits_{l = 1}^{n} \begin{cases} p_{i}^{2} & \text{for } i = j \\ 2p_{i}p_{j} & \text{for } i \neq j \end{cases} \right) \qquad \text{(Equation 1)}$

where n denotes the number of loci, i and j denote the two alleles at the lth locus, and p_(i) and p_(j) denote the frequency of the ith and jth allele of the lth locus in the population being considered.

Calculations are made for each population using the loci and frequencies provided in Table 4 or 5, and the cannabis sample is assigned to the population in which it has the highest likelihood of belonging.
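The assignment step can be illustrated with the Python sketch below. The function names, SNP identifiers and allele frequencies are hypothetical and are not taken from Table 4 or 5. The small frequency floor that prevents taking the logarithm of zero is an assumption added for the sketch and is not part of the described test.

```python
import math
from typing import Dict, Tuple

Genotypes = Dict[str, Tuple[str, str]]        # SNP id -> two detected alleles
Frequencies = Dict[str, Dict[str, float]]     # SNP id -> allele -> frequency


def log_likelihood(sample: Genotypes, freqs: Frequencies,
                   floor: float = 1e-6) -> float:
    """Log-likelihood of a multilocus genotype in one population."""
    total = 0.0
    for snp, (allele_i, allele_j) in sample.items():
        if snp not in freqs:
            continue  # locus not in the reference profile
        p_i = max(freqs[snp].get(allele_i, 0.0), floor)
        p_j = max(freqs[snp].get(allele_j, 0.0), floor)
        if allele_i == allele_j:
            total += math.log(p_i * p_i)      # homozygote: p_i squared
        else:
            total += math.log(2 * p_i * p_j)  # heterozygote: 2 p_i p_j
    return total


def assign_population(sample: Genotypes,
                      populations: Dict[str, Frequencies]) -> str:
    """Assign the sample to the population with the highest likelihood."""
    scores = {name: log_likelihood(sample, f) for name, f in populations.items()}
    return max(scores, key=scores.get)


# Hypothetical two-SNP reference profiles
populations = {
    "hemp": {"snp1": {"A": 0.8, "C": 0.2}, "snp2": {"G": 0.7, "T": 0.3}},
    "marijuana": {"snp1": {"A": 0.1, "C": 0.9}, "snp2": {"G": 0.2, "T": 0.8}},
}
sample = {"snp1": ("A", "A"), "snp2": ("G", "T")}
print(assign_population(sample, populations))
```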

Example 5

Ancestry Estimation

Calculation of a novel sample's hybridization index (e.g. ancestry contribution) can be performed if the novel sample has been genotyped for ancestry informative markers for which the alleles and allele frequencies are already known in the ancestral populations.

Ancestry analysis can determine if the cannabis sample is a ‘pure’ descendant of a reference sample or reference profile or if it is the result of interbreeding between individuals from two different populations, i.e. an admixed individual or ‘intraspecific hybrid’.

Campton and Utter (1985) developed a “hybrid index” [25]. The hybrid index can be regarded as a way of visualizing the relative assignment probabilities in an assignment test involving two parental populations. The hybrid index, I_(H), requires three samples (or a sample and two reference profiles), i.e. a sample or reference profile of each of the two possible parental populations and a sample of the group of suspected ‘hybrids’.

I_(H) is calculated as:

$I_{H} = 1 - \frac{\log\left( p_{x} \right)}{\log\left( p_{x} \right) + \log\left( p_{y} \right)} \qquad \text{(Equation 2)}$

where p_(x) denotes the likelihood of the multilocus genotype of an individual in population x and p_(y) similarly denotes the likelihood in population y, calculated as in Equation 1.
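As a brief illustration of Equation 2, the following Python sketch computes I_(H) from two log-likelihoods that are assumed to have been obtained beforehand as in Equation 1. The function name and the input values are hypothetical and chosen only to show how the index behaves.

```python
def hybrid_index(log_p_x, log_p_y):
    """Equation 2: I_H = 1 - log(p_x) / (log(p_x) + log(p_y)).

    Both arguments are log-likelihoods of the same multilocus genotype,
    one per parental population, computed as in Equation 1.
    """
    denominator = log_p_x + log_p_y
    if denominator == 0.0:
        # Only happens when both likelihoods equal 1; the ratio is undefined.
        raise ValueError("hybrid index is undefined when both log-likelihoods are 0")
    return 1.0 - log_p_x / denominator


# Hypothetical log-likelihoods for three samples.
print(hybrid_index(-2.0, -18.0))   # genotype far more likely in population x
print(hybrid_index(-10.0, -10.0))  # genotype equally likely in both
print(hybrid_index(-18.0, -2.0))   # genotype far more likely in population y
```

Since both log-likelihoods are at most zero, the index lies between 0 and 1. Values near 1 correspond to genotypes that are much more likely in population x, values near 0 to genotypes much more likely in population y, and intermediate values are consistent with admixture.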

Example 6

SNP Discovery

Genetic material and genotyping. The marijuana strains genotyped were grown by Health Canada authorized producers and represent germplasm grown and used for breeding in the medical and recreational marijuana industries (Table 10). DNA was extracted from leaf tissue using standard protocols, and library preparation and sequencing were performed using the GBS protocol published by Poland et al. [26]. SNPs were called using the GBS pipeline developed by Melo et al. [27], aligning to the canSat5 C. sativa reference genome assembly (unpublished). Quality filtering of genetic markers was performed in PLINK [17] by removing SNPs with (i) greater than 20% missingness by locus, (ii) a minor allele frequency less than 1% and (iii) excess heterozygosity (a Hardy-Weinberg equilibrium p-value less than 0.0001). After filtering, 9,123 SNPs remained for analysis.
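The three quality filters can be illustrated with a short Python sketch. This is not the PLINK implementation that was used: the genotype encoding (0, 1 or 2 copies of the non-reference allele, `None` for a missing call) and the function name are assumptions, and the Hardy-Weinberg check uses a simple one-degree-of-freedom chi-square approximation in place of whatever test PLINK applied. PLINK exposes comparable thresholds through its `--geno`, `--maf` and `--hwe` options.

```python
import math


def passes_qc(calls, max_missing=0.20, min_maf=0.01, hwe_alpha=1e-4):
    """Apply the three per-SNP filters to one SNP.

    calls: genotype calls across samples; 0, 1 or 2 copies of the
           non-reference allele, None for a missing call (ASSUMED encoding).
    """
    observed = [g for g in calls if g is not None]
    if not calls or not observed:
        return False

    # (i) missingness by locus
    missing_rate = 1.0 - len(observed) / len(calls)
    if missing_rate > max_missing:
        return False

    # (ii) minor allele frequency
    n = len(observed)
    q = sum(observed) / (2.0 * n)  # non-reference allele frequency
    p = 1.0 - q
    maf = min(p, q)
    if maf == 0.0 or maf < min_maf:
        return False

    # (iii) excess heterozygosity, chi-square approximation (1 d.f.)
    counts = (observed.count(0), observed.count(1), observed.count(2))
    expected = (n * p * p, 2.0 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(counts, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square survival, 1 d.f.
    excess_het = counts[1] > expected[1]
    if excess_het and p_value < hwe_alpha:
        return False
    return True


# Hypothetical SNPs.
balanced = [0] * 25 + [1] * 50 + [2] * 25        # matches HWE expectations
all_het = [1] * 50                               # extreme excess heterozygosity
sparse = [None, None, None, 0, 1, 2, 1, 0, 1, 1] # 30% missing calls
monomorphic = [0] * 100                          # minor allele absent

for name, snp in [("balanced", balanced), ("all_het", all_het),
                  ("sparse", sparse), ("monomorphic", monomorphic)]:
    print(name, passes_qc(snp))
```

Only the first hypothetical SNP passes; the others fail the heterozygosity, missingness and minor allele frequency filters respectively.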

Table 8 identifies the major and minor alleles identified in Cannabis sativa and Cannabis indica and Table 9 provides upstream and downstream sequence for each SNP. Table 10 provides reference information on the reported ancestry. FIG. 8 shows a PCA analysis based on whether the strain is reported as C. indica or C. sativa.

TABLE 1 Positions and allele frequencies of the top 50 SNPs by FST between marijuana and hemp calculated according to equation 10 in Weir and Cockerham (1984) [19].

| Scaffold | SNP Position on Scaffold | FST Between Hemp and Marijuana | Reference Allele | Non-Reference Allele | Non-Reference Allele Frequency in Marijuana | Non-Reference Allele Frequency in Hemp |
|---|---|---|---|---|---|---|
| scaffold13038 | 51303 | 0.865332249 | C | A | 0 | 0.8158 |
| scaffold25092 | 11841 | 0.85295301 | A | G | 0.08333 | 0.9583 |
| scaffold23837 | 26190 | 0.81783437 | C | T | 0.04938 | 0.8605 |
| scaffold152474 | 1505 | 0.813648101 | C | A | 0.01282 | 0.7857 |
| scaffold152474 | 1465 | 0.810651442 | G | A | 0.03205 | 0.8214 |
| scaffold13038 | 51162 | 0.805163299 | C | T | 0 | 0.7436 |
| scaffold5841 | 136325 | 0.782037623 | A | T | 0.006329 | 0.7059 |
| scaffold5876 | 22669 | 0.779767209 | A | T | 0.175 | 1 |
| scaffold764 | 75880 | 0.770853139 | C | A | 0 | 0.7024 |
| scaffold32076 | 9119 | 0.767411438 | C | A | 0.01316 | 0.7143 |
| scaffold7992 | 917 | 0.763602811 | T | A | 0.006329 | 0.7073 |
| scaffold118405 | 2916 | 0.761994625 | C | T | 0.03125 | 0.7564 |
| scaffold2418 | 77480 | 0.758065142 | G | T | 0 | 0.6618 |
| scaffold34829 | 1873 | 0.746880355 | T | C | 0.1645 | 0.9535 |
| scaffold38125 | 4641 | 0.743833337 | G | T | 0.03704 | 0.7558 |
| scaffold37469 | 74270 | 0.740257391 | G | A | 0.1013 | 0.8611 |
| C32100775 | 1618 | 0.736074476 | T | C | 0.01316 | 0.6613 |
| scaffold5190 | 41424 | 0.734073333 | C | G | 0.109 | 0.869 |
| scaffold5190 | 41454 | 0.734073333 | G | A | 0.109 | 0.869 |
| scaffold49917 | 1105 | 0.726261988 | A | G | 0.08642 | 0.8256 |
| scaffold4775 | 75747 | 0.719005761 | T | A | 0 | 0.625 |
| scaffold26152 | 15432 | 0.712320041 | G | A | 0.05921 | 0.7571 |
| scaffold5841 | 135056 | 0.712144097 | G | A | 0 | 0.6429 |
| scaffold5876 | 106401 | 0.711616024 | A | G | 0 | 0.6395 |
| scaffold5113 | 138895 | 0.711038199 | A | G | 0 | 0.6111 |
| scaffold5113 | 239777 | 0.710611128 | G | A | 0 | 0.631 |
| scaffold5841 | 135067 | 0.70853417 | G | A | 0 | 0.6429 |
| scaffold25099 | 7321 | 0.702374753 | A | G | 0.05556 | 0.7439 |
| scaffold14172 | 40875 | 0.702287484 | A | C | 0.01282 | 0.5682 |
| scaffold158332 | 502 | 0.702084439 | A | G | 0.006173 | 0.6395 |
| scaffold16869 | 149709 | 0.700892589 | T | G | 0.1562 | 0.9024 |
| scaffold114539 | 1119 | 0.695979272 | C | T | 0 | 0.6111 |
| scaffold3842 | 272682 | 0.69126068 | T | G | 0.01266 | 0.6 |
| scaffold60331 | 8510 | 0.690556532 | G | C | 0.03425 | 0.6892 |
| scaffold2360 | 22109 | 0.689941474 | G | A | 0.07143 | 0.7639 |
| scaffold72613 | 1728 | 0.689632536 | G | A | 0.04321 | 0.7093 |
| C32052717 | 309 | 0.688864374 | A | T | 0.142 | 0.8721 |
| scaffold71943 | 13435 | 0.685008651 | C | T | 0.01852 | 0.6512 |
| scaffold71943 | 13471 | 0.685008651 | T | C | 0.01852 | 0.6512 |
| scaffold128544 | 170 | 0.679301025 | G | T | 0.01875 | 0.6316 |
| scaffold98263 | 1069 | 0.677597364 | T | C | 0.07407 | 0.7558 |
| C32058675 | 292 | 0.673619414 | A | G | 0.01852 | 0.6395 |
| scaffold823 | 11824 | 0.673537258 | C | T | 0.01299 | 0.6111 |
| scaffold75287 | 5899 | 0.673496392 | G | T | 0.04321 | 0.6905 |
| scaffold16869 | 149730 | 0.671387324 | C | T | 0.1562 | 0.875 |
| scaffold3842 | 543149 | 0.670719701 | C | T | 0.02469 | 0.6212 |
| C32058613 | 508 | 0.67046277 | T | C | 0 | 0.5588 |
| C32058613 | 563 | 0.67046277 | C | T | 0 | 0.5588 |
| scaffold7146 | 70340 | 0.669199626 | C | T | 0.1667 | 0.8889 |
| scaffold23125 | 41276 | 0.669066022 | G | A | 0.03846 | 0.675 |

TABLE 2 Sample names and reported C. sativa and C. indica ancestry of genotyped marijuana strains.

| Sample ID | Sample Name | Reported Proportion C. sativa | Reported Proportion C. indica | Reference for Reported Ancestry |
|---|---|---|---|---|
| P2_A01_M_0001 | C. indica (Afghanistan) | 0 | 100 | Author D H |
| P2_A02_M_0002 | Ata Tundra | 0 | 100 | http://www.gorilla-cannabis-seeds.co.uk/seedsman/regular/ata-tundra.html |
| P2_A03_M_0003 | Big Bang | 20 | 80 | http://www.kindgreenbuds.com/marijuana-strains/big-bang/ |
| P2_A04_M_0004 | Big Bang (Autoflowering) | 10 | 80 | http://azarius.net/seedshop/greenhouseseeds/big_bang_autoflowering_greenhouse/ |
| P2_A05_M_0005 | Dr. Grinspoon | 100 | 0 | http://www.leafly.com/sativa/dr-grinspoon |
| P2_A06_M_0006 | Hash Passion | 0 | 100 | http://www.seedsman.com/en/hash-passion-seeds |
| P2_A07_M_0007 | Indian Haze (Haze Mist) | 100 | 0 | http://en.seedfinder.eu/strain-info/Indian_Haze/Seedsman/ |
| P2_A09_M_0009 | King Kush | 30 | 70 | http://grow-marijuana.com/strain-reviews/king-kush |
| P2_A10_M_0010 | Master Kush | 5 | 95 | http://sensiseeds.com/en/cannabis-seeds/whitelabel/master-kush |
| P2_A11_M_0011 | Master Kush | 5 | 95 | http://sensiseeds.com/en/cannabis-seeds/whitelabel/master-kush |
| P2_A12_M_0012 | Master Kush | 5 | 95 | http://sensiseeds.com/en/cannabis-seeds/whitelabel/master-kush |
| P2_B01_M_0013 | Master Kush | 5 | 95 | http://sensiseeds.com/en/cannabis-seeds/whitelabel/master-kush |
| P2_B03_M_0015 | Neville's Haze | 75 | 25 | http://www.leafly.com/hybrid/nevilles-haze |
| P2_B04_M_0016 | C. indica (Pakistan) | 0 | 100 | Author D H |
| P2_B05_M_0017 | C. sativa (South Africa) | 100 | 0 | Author D H |
| P2_B07_M_0019 | White Rhino | 10 | 90 | http://www.kindgreenbuds.com/marijuana-strains/white-rhino/ |
| P2_B08_M_0020 | White Widow | 50 | 50 | http://www.royalqueenseeds.com/122-white-widow.html |
| P2_B09_M_0021 | Chemdawg | 50 | 50 | http://www.tweed.com/collections/all-strains/products/donegal-chem-dawg |
| P2_B10_M_0022 | Vanilla Haze | 20 | 80 | https://www.barneysfarmshop.com/seeds/vanilla-kush.html |
| P2_B12_M_0024 | Sunshine | 90 | 10 | Author D H |
| P2_C01_M_0025 | NL5 Haze Mist | 50 | 50 | http://www.popularseeds.com/green-house-seeds/nl5-haze-mist |
| P2_C03_M_0027 | El Nino | 40 | 60 | http://www.kindgreenbuds.com/marijuana-strains/el-nino/ |
| P2_C04_M_0028 | Shark | 25 | 75 | http://www.weedyard.com/Strains/SharkShock.html |
| P2_C05_M_0029 | King Kush | 30 | 70 | http://www.wikileaf.com/strain/kings-kush/ |
| P2_C06_M_0030 | Dr. Grinspoon | 100 | 0 | http://www.leafly.com/sativa/dr-grinspoon |
| P2_C07_M_0031 | Exodus Cheese | 40 | 60 | http://www.gorilla-cannabis-seeds.co.uk/greenhouseseeds/feminized/exodus-cheese-feminized.html |
| P2_C08_M_0032 | Jenni | 75 | 25 | Author D H |
| P2_C09_M_0033 | Hawaiian Snow | 90 | 10 | http://www.kindgreenbuds.com/marijuana-strains/hawaiian-snow/ |
| P2_C10_M_0034 | GH Cheese | 40 | 60 | http://azarius.net/seedshop/greenhouseseeds/cheese_greenhouse_feminised/ |
| P2_C11_M_0035 | Kalishnikova | 20 | 80 | http://azarius.net/seedshop/greenhouseseeds/kalashnikova_greenhouse_feminized/ |
| P2_D01_M_0037 | Great White Shark | 25 | 75 | http://www.kindgreenbuds.com/marijuana-strains/white-shark/ |
| P2_D02_M_0038 | Strawberry Haze | 70 | 30 | http://www.kindgreenbuds.com/marijuana-strains/arjans-strawberry-haze/ |
| P2_D03_M_0039 | Himalayan Gold | 0 | 100 | http://www.kindgreenbuds.com/marijuana-strains/himalaya-gold/ |
| P2_D04_M_0040 | Ortega BC | 0 | 100 | http://www.leafly.com/indica/ortega |
| P2_D05_M_0041 | Atomic Haze | 80 | 20 | http://www.cannaseur.com/index.php/online-store/female-seeds/atomic-haze-female-detail |
| P2_D06_M_0042 | Domina Haze | 85 | 15 | Author D H |
| P2_D07_M_0043 | Rio | 25 | 75 | Author D H |
| P2_D08_M_0044 | Damn Sour | 60 | 40 | http://www.cannabissearch.com/strains/damn-sour/ |
| P2_D09_M_0045 | Nina | 75 | 25 | http://www.kindgreenbuds.com/marijuana-strains/la-nina/ |
| P2_D10_M_0046 | White Widow | 50 | 50 | http://www.royalqueenseeds.com/122-white-widow.html |
| P2_E01_M_0049 | Purple Sativa | 70 | 30 | Author D H |
| P2_E02_M_0050 | Bubba Kush | 10 | 90 | http://www.tweed.com/collections/all-strains/products/norfolk-bubba-kush |
| P2_E03_M_0051 | Delahaze | 70 | 30 | https://www.paradise-seeds.com/en/delahaze.html |
| P2_E05_M_0053 | Super Silver Haze | 75 | 25 | http://www.tweed.com/collections/all-strains/products/leonidas-super-silver-haze |
| P2_E06_M_0054 | Jack Herer | 70 | 30 | http://www.tweed.com/collections/all-strains/products/birds-eye-jack-herer |
| P2_E07_M_0055 | Pennywise | 40 | 60 | http://www.tweed.com/collections/all-strains/products/nova-pennywise |
| P2_E08_M_0056 | White Berry | 25 | 75 | http://www.harborsidehealthcenter.com/learn/white-berry-medical-cannabis.html |
| P2_E09_M_0057 | Skunk Haze | 55 | 45 | http://www.kindgreenbuds.com/marijuana-strains/skunk-haze/ |
| P2_E10_M_0058 | Durban Poison | 100 | 0 | http://www.leafly.com/sativa/durban-poison |
| P2_E12_M_0060 | Happy Face | 25 | 75 | Author D H |
| P2_F01_M_0061 | White Rhino | 10 | 90 | http://www.kindgreenbuds.com/marijuana-strains/white-rhino/ |
| P2_F03_M_0063 | Ice Cream | 40 | 60 | http://www.kindgreenbuds.com/marijuana-strains/ice-cream/ |
| P2_F04_M_0064 | C. sativa (Thailand/Laos) | 100 | 0 | Author D H |
| P2_F05_M_0065 | Diamond Girl (Silver Pearl) | 40 | 60 | Author D H |
| P2_F06_M_0066 | AMS | 30 | 70 | http://www.kindgreenbuds.com/marijuana-strains/ams/ |
| P2_F07_M_0067 | Lemon Skunk | 60 | 40 | http://www.kindgreenbuds.com/marijuana-strains/lemon-skunk/ |
| P2_F08_M_0068 | Arjans Haze #2 | 90 | 10 | http://www.kindgreenbuds.com/marijuana-strains/arjans-haze-2/ |
| P2_F10_M_0070 | White Domina | 0 | 100 | http://www.headsite.com/white-domina-feminised-seeds-kannabia-446-p.asp |
| P2_F11_M_0071 | Blue Hell | 20 | 80 | http://www.cannabis-seeds.co.uk/medicalseeds/bluehell.html |
| P2_F12_M_0072 | Neville's White Widow | 80 | 20 | Author D H |
| P2_G01_M_0073 | Alaskan Ice | 70 | 30 | http://www.kindgreenbuds.com/marijuana-strains/alaskan-ice/ |
| P2_G03_M_0075 | Arjans Ultra Haze #1 | 80 | 20 | http://www.cannabissearch.com/strains/arjans-ultra-haze/ |
| P2_G04_M_0076 | Ken's Sweet Tooth | 20 | 80 | https://www.barneysfarmshop.com/seeds/barneys-farm-sweet-tooth.html |
| P2_G05_M_0077 | Neville's Haze | 75 | 25 | http://www.kindgreenbuds.com/marijuana-strains/nevilles-haze/ |
| P2_G06_M_0078 | Arjans Haze #3 | 80 | 20 | http://www.kindgreenbuds.com/marijuana-strains/arjans-haze-3/ |
| P2_G07_M_0079 | Cupid | 50 | 50 | Author D H |
| P2_G08_M_0080 | Super Critical | 25 | 75 | http://www.wikileaf.com/strain/super-critical/ |
| P2_G09_M_0081 | Super Bud | 35 | 65 | http://www.kindgreenbuds.com/marijuana-strains/ed-rosenthal-super-bud/ |
| P2_G10_M_0082 | LadyBurn 1974 | 50 | 50 | http://www.cannabissearch.com/strains/ladyburn-1974/ |
| P2_G11_M_0083 | Trainwreck | 90 | 10 | http://www.wikileaf.com/strain/trainwreck/ |
| P2_G12_M_0084 | C. indica (Afghanistan) | 0 | 100 | Author D H |
| P2_H01_M_0085 | Almighty Whitey | 50 | 50 | Author D H |
| P2_H02_M_0086 | C. sativa (Guatemala) | 100 | 0 | http://www.seedsman.com/en/guatemala-regular-seeds |
| P2_H03_M_0087 | C. sativa (Laos) | 100 | 0 | Author D H |
| P2_H04_M_0088 | La Riena de Africa | 100 | 0 | http://www.seedsman.com/en/la-reina-de-africa-feminised-seeds |
| P2_H05_M_0089 | Raspberry Cough | 70 | 30 | http://www.kindgreenbuds.com/marijuana-strains/raspberry-cough/ |
| P2_H06_M_0090 | Super Lemon Haze | 80 | 20 | http://www.wikileaf.com/strain/super-lemon-haze/ |
| P2_H08_M_0092 | Jocelyn | 60 | 40 | Author D H |
| P2_H09_M_0093 | Big Bang | 20 | 80 | http://www.kindgreenbuds.com/marijuana-strains/big-bang/ |
| P2_H10_M_0094 | C. indica (Afghanistan) | 0 | 100 | Author D H |
| P2_H11_M_0095 | Jamaican Lambs Bread | 100 | 0 | http://www.leafly.com/sativa/lamb-s-bread |

TABLE 3 Sample names of genotyped hemp varieties.

| Sample ID | Sample Name |
|---|---|
| P1_A02_H_0001_2 | Felina |
| P1_A04_H_0002_2 | Ferimon |
| P1_A06_H_0003_2 | Kompolti |
| P1_A08_H_0004_2 | Uniko B |
| P1_A10_H_0005_2 | Fedora 19 |
| P1_A11_H_0006_1 | Futura 77 |
| P1_B01_H_0007_1 | Fedrina |
| P1_B04_H_0008_2 | Suditalien or Sudi |
| P1_B05_H_0009_1 | LKSD or LKCSD |
| P1_B09_H_0011_1 | Bialobrzeskie |
| P1_B12_H_0012_2 | VIR 541 |
| P1_C01_H_0013_1 | VIR 569 |
| P1_C04_H_0014_2 | VIR 575 |
| P1_C05_H_0015_1 | Silesia |
| P1_C08_H_0016_2 | VIR 577 |
| P1_C10_H_0017_2 | Carmagnola |
| P1_D02_H_0019_2 | Zolotonsha 15 |
| P1_D03_H_0020_1 | Fedora 17 |
| P1_D05_H_0021_1 | K110 |
| P1_D07_H_0022_1 | Novosadska |
| P1_D09_H_0023_1 | Jus 8 |
| P1_E01_H_0025_1 | Delores |
| P1_E03_H_0026_1 | Petera |
| P1_E05_H_0027_1 | CAN 29/94 |
| P1_E07_H_0028_1 | CAN 37/97 |
| P1_E10_H_0029_2 | CAN 40/99 |
| P1_F03_H_0032_1 | CAN 39/98 |
| P1_F06_H_0033_2 | CAN 100/01 |
| P1_F07_H_0034_1 | CAN 18/95 |
| P1_F09_H_0035_1 | CAN 20/02 |
| P1_F11_H_0036_1 | CAN 24/89 |
| P1_G01_H_0037_1 | CAN 23/99 |
| P1_G04_H_0038_2 | CAN 17/95 |
| P1_G05_H_0039_1 | CAN 19/87 |
| P1_G07_H_0040_1 | CAN 22/88 |
| P1_G10_H_0041_2 | CAN 26/93 |
| P1_G11_H_0042_1 | CAN 16/94 |
| P1_H02_H_0044_1 | CAN 28/01 |
| P1_H04_H_0045_1 | Chameleon |
| P1_H07_H_0046_2 | Tygra |
| P1_H09_H_0047_2 | Carmen |
| P1_H10_H_0048_1 | Alyssa |
| P2_H12_H_0096 | Finola |

TABLE 4 Positions and allele frequencies of the top 100 SNPs by FST between Cannabis sativa and Cannabis indica calculated according to equation 10 in Weir and Cockerham (1984) [19]

| SNP Name | SEQ ID | SEQ ID | Minor Allele | Major Allele | FST | Minor Allele Frequency (Indica) | Minor Allele Frequency (Sativa) |
|---|---|---|---|---|---|---|---|
| scaffold14566:24841 | 1 | 2 | C | T | 1 | 0 | 1 |
| scaffold2257:59436 | 3 | 4 | A | C | 0.941 | 0 | 0.9444 |
| scaffold123303:7086 | 5 | 6 | A | C | 0.938 | 0 | 0.9444 |
| scaffold21832:18317 | 7 | 8 | C | T | 0.937 | 0 | 0.9375 |
| scaffold10653:22776 | 9 | 10 | C | T | 0.937 | 0 | 0.9375 |
| scaffold34968:5203 | 11 | 12 | C | A | 0.933 | 0 | 0.9375 |
| scaffold41828:12391 | 13 | 14 | T | C | 0.883 | 0 | 0.8889 |
| C32084869:1171 | 15 | 16 | A | T | 0.876 | 0.9444 | 0.05556 |
| scaffold1342:67015 | 17 | 18 | C | T | 0.876 | 0.05556 | 0.9444 |
| scaffold5876:136612 | 19 | 20 | C | T | 0.876 | 0.05556 | 0.9444 |
| scaffold10653:22755 | 21 | 22 | G | A | 0.868 | 0.05556 | 0.9375 |
| scaffold65043:4386 | 23 | 24 | C | A | 0.866 | 0 | 0.875 |
| scaffold39548:5676 | 25 | 26 | T | C | 0.866 | 0 | 0.875 |
| scaffold39548:5703 | 27 | 28 | A | G | 0.866 | 0 | 0.875 |
| scaffold17605:9224 | 29 | 30 | G | A | 0.866 | 0 | 0.875 |
| scaffold94301:10558 | 31 | 32 | C | T | 0.858 | 0.0625 | 0.9375 |
| scaffold52608:15151 | 33 | 34 | G | A | 0.857 | 0 | 0.8571 |
| scaffold9110:12755 | 35 | 36 | G | A | 0.845 | 0.8571 | 0 |
| C32099389:1875 | 37 | 38 | G | A | 0.835 | 0.0625 | 0.9167 |
| C32076905:769 | 39 | 40 | G | T | 0.825 | 0 | 0.8333 |
| scaffold50412:1217 | 41 | 42 | T | C | 0.825 | 0.8333 | 0 |
| scaffold73281:3231 | 43 | 44 | T | C | 0.825 | 0 | 0.8333 |
| scaffold60591:9229 | 45 | 46 | T | C | 0.825 | 0 | 0.8333 |
| scaffold60591:9364 | 47 | 48 | G | T | 0.825 | 0 | 0.8333 |
| scaffold27976:11462 | 49 | 50 | T | A | 0.825 | 0.8333 | 0 |
| scaffold4591:18700 | 51 | 52 | A | G | 0.825 | 0.8333 | 0 |
| scaffold96873:16326 | 53 | 54 | A | G | 0.817 | 0 | 0.8333 |
| scaffold6777:12771 | 55 | 56 | C | T | 0.816 | 0 | 0.8333 |
| C31894837:130 | 57 | 58 | A | G | 0.812 | 0 | 0.8125 |
| scaffold1342:66883 | 59 | 60 | C | T | 0.812 | 0.8125 | 0 |
| C32035477:264 | 61 | 62 | G | A | 0.811 | 0.8889 | 0.05556 |
| scaffold6360:1717 | 63 | 64 | T | C | 0.811 | 0.05556 | 0.8889 |
| scaffold132623:4144 | 65 | 66 | T | C | 0.811 | 0.8889 | 0.05556 |
| scaffold72006:11851 | 67 | 68 | T | A | 0.809 | 0 | 0.8 |
| scaffold6742:34122 | 69 | 70 | C | T | 0.807 | 0 | 0.8333 |
| scaffold3108:8782 | 71 | 72 | G | A | 0.806 | 0 | 0.8333 |
| scaffold11225:9528 | 73 | 74 | A | G | 0.806 | 0 | 0.75 |
| scaffold11225:9539 | 75 | 76 | T | A | 0.806 | 0 | 0.75 |
| scaffold2579:59858 | 77 | 78 | A | G | 0.803 | 0.8889 | 0.05556 |
| scaffold5876:136559 | 79 | 80 | T | C | 0.803 | 0.8889 | 0.05556 |
| scaffold5876:136606 | 81 | 82 | T | C | 0.803 | 0.8889 | 0.05556 |
| scaffold23386:12397 | 83 | 84 | A | G | 0.803 | 0 | 0.8125 |
| scaffold93032:4944 | 85 | 86 | A | G | 0.791 | 0 | 0.8125 |
| scaffold94004:13663 | 87 | 88 | A | G | 0.791 | 0.8125 | 0 |
| scaffold26621:73038 | 89 | 90 | C | A | 0.791 | 0 | 0.8125 |
| scaffold26621:73058 | 91 | 92 | G | A | 0.791 | 0 | 0.8125 |
| scaffold62259:15201 | 93 | 94 | C | A | 0.781 | 0 | 0.75 |
| scaffold38801:13810 | 95 | 96 | A | G | 0.768 | 0 | 0.7778 |
| scaffold6550:117010 | 97 | 98 | C | T | 0.768 | 0 | 0.7778 |
| scaffold94863:28465 | 99 | 100 | T | C | 0.767 | 0.1667 | 1 |
| scaffold130551:553 | 101 | 102 | T | A | 0.759 | 0 | 0.7778 |
| scaffold21832:579 | 103 | 104 | C | A | 0.759 | 0 | 0.7778 |
| C32090201:1470 | 105 | 106 | C | T | 0.759 | 0 | 0.7778 |
| scaffold117639:1939 | 107 | 108 | C | A | 0.759 | 0 | 0.7778 |
| scaffold73281:2531 | 109 | 110 | A | G | 0.759 | 0 | 0.7778 |
| scaffold95390:2586 | 111 | 112 | T | C | 0.759 | 0 | 0.7778 |
| scaffold109105:3417 | 113 | 114 | C | T | 0.759 | 0.7778 | 0 |
| scaffold9670:13701 | 115 | 116 | T | C | 0.759 | 0 | 0.7778 |
| scaffold45478:20587 | 117 | 118 | C | T | 0.759 | 0 | 0.7778 |
| scaffold23700:29096 | 119 | 120 | C | A | 0.759 | 0.7778 | 0 |
| scaffold10732:30120 | 121 | 122 | A | G | 0.759 | 0 | 0.7778 |
| scaffold16969:31933 | 123 | 124 | T | C | 0.759 | 0.7778 | 0 |
| scaffold3884:40695 | 125 | 126 | A | G | 0.759 | 0 | 0.7778 |
| scaffold829:52127 | 127 | 128 | A | G | 0.759 | 0 | 0.7778 |
| scaffold6550:117004 | 129 | 130 | G | A | 0.759 | 0.7778 | 0 |
| scaffold27604:1398 | 131 | 132 | A | G | 0.757 | 0 | 0.6667 |
| scaffold125644:4761 | 133 | 134 | G | A | 0.757 | 0.7778 | 0 |
| scaffold16027:10666 | 135 | 136 | C | T | 0.757 | 0 | 0.7778 |
| scaffold2502:17437 | 137 | 138 | G | A | 0.757 | 0 | 0.7778 |
| scaffold2502:17515 | 139 | 140 | C | G | 0.757 | 0 | 0.7778 |
| scaffold70502:1951 | 141 | 142 | C | T | 0.754 | 0.8571 | 0.0625 |
| scaffold40620:28184 | 143 | 144 | T | C | 0.751 | 0 | 0.75 |
| scaffold40620:28194 | 145 | 146 | G | A | 0.751 | 0 | 0.75 |
| scaffold40620:28201 | 147 | 148 | A | T | 0.751 | 0 | 0.75 |
| scaffold118158:666 | 149 | 150 | C | G | 0.74 | 0.1111 | 0.8889 |
| scaffold41951:881 | 151 | 152 | T | A | 0.74 | 0.1111 | 0.8889 |
| scaffold95666:9974 | 153 | 154 | T | C | 0.74 | 0.8889 | 0.1111 |
| scaffold12645:86648 | 155 | 156 | C | T | 0.74 | 0.1111 | 0.8889 |
| scaffold6627:26364 | 157 | 158 | T | C | 0.738 | 0.05556 | 0.8333 |
| C32050599:443 | 159 | 160 | T | C | 0.738 | 0.75 | 0 |
| scaffold20861:14886 | 161 | 162 | A | G | 0.737 | 0.7778 | 0 |
| scaffold30119:28969 | 163 | 164 | T | A | 0.732 | 0.8333 | 0.0625 |
| scaffold2257:75397 | 165 | 166 | T | C | 0.731 | 0 | 0.7778 |
| scaffold46867:905 | 167 | 168 | T | C | 0.73 | 0.7143 | 0 |
| scaffold65132:21260 | 169 | 170 | A | T | 0.729 | 0 | 0.75 |
| scaffold94863:28441 | 171 | 172 | C | T | 0.729 | 0.1667 | 1 |
| scaffold94004:13590 | 173 | 174 | T | C | 0.726 | 0.75 | 0 |
| scaffold94004:13632 | 175 | 176 | T | A | 0.726 | 0.75 | 0 |
| scaffold39420:9067 | 177 | 178 | T | C | 0.723 | 0.1111 | 0.875 |
| scaffold42291:6484 | 179 | 180 | C | T | 0.717 | 0.7143 | 0 |
| scaffold23828:34435 | 181 | 182 | G | T | 0.714 | 0.9286 | 0.1667 |
| scaffold15017:4539 | 183 | 184 | C | T | 0.714 | 0.07143 | 0.8333 |
| scaffold16607:2589 | 185 | 186 | T | C | 0.713 | 0.875 | 0.1111 |
| scaffold27758:4907 | 187 | 188 | G | A | 0.713 | 0.875 | 0.1111 |
| scaffold20809:7695 | 189 | 190 | G | C | 0.713 | 0.875 | 0.1111 |
| scaffold36583:13571 | 191 | 192 | T | C | 0.713 | 0.875 | 0.1111 |
| scaffold36500:1728 | 193 | 194 | T | C | 0.712 | 0 | 0.7222 |
| scaffold36500:1740 | 195 | 196 | C | T | 0.712 | 0 | 0.7222 |
| scaffold36500:1749 | 197 | 198 | T | C | 0.712 | 0 | 0.7222 |
| scaffold153198:2269 | 199 | 200 | A | T | 0.712 | 0 | 0.7222 |

TABLE 5 Positions and allele frequencies of the top 100 SNPs by FST between marijuana and hemp calculated according to equation 10 in Weir and Cockerham (1984) [19]

| SNP Name | SEQ ID | SEQ ID | Minor Allele | Major Allele | FST | Minor Allele Frequency (Marijuana) | Minor Allele Frequency (Hemp) |
|---|---|---|---|---|---|---|---|
| scaffold13038:51303 | 201 | 202 | C | A | 0.865332249 | 0 | 0.8378 |
| scaffold25092:11841 | 203 | 204 | A | G | 0.85295301 | 0.08333 | 0.9571 |
| scaffold23837:26190 | 205 | 206 | C | T | 0.81783437 | 0.0375 | 0.881 |
| scaffold152474:1505 | 207 | 208 | C | A | 0.813648101 | 0 | 0.8049 |
| scaffold152474:1465 | 209 | 210 | G | A | 0.810651442 | 0.01948 | 0.8415 |
| scaffold13038:51162 | 211 | 212 | C | T | 0.805163299 | 0 | 0.7632 |
| scaffold5841:136325 | 213 | 214 | A | T | 0.782037623 | 0.00641 | 0.7273 |
| scaffold5876:22669 | 215 | 216 | A | T | 0.779767209 | 0.1646 | 1 |
| scaffold764:75880 | 217 | 218 | C | A | 0.770853139 | 0 | 0.7195 |
| scaffold32076:9119 | 219 | 220 | C | A | 0.767411438 | 0.01316 | 0.7353 |
| scaffold7992:917 | 221 | 222 | T | A | 0.763602811 | 0.00641 | 0.725 |
| scaffold118405:2916 | 223 | 224 | C | T | 0.761994625 | 0.03165 | 0.75 |
| scaffold2418:77480 | 225 | 226 | G | T | 0.758065142 | 0 | 0.6818 |
| scaffold34829:1873 | 227 | 228 | T | C | 0.746880355 | 0.1533 | 0.9524 |
| scaffold38125:4641 | 229 | 230 | G | T | 0.743833337 | 0.0375 | 0.7738 |
| scaffold37469:74270 | 231 | 232 | G | A | 0.740257391 | 0.1026 | 0.8611 |
| C32100775:1618 | 233 | 234 | T | C | 0.736074476 | 0.006667 | 0.6613 |
| scaffold5190:41424 | 235 | 236 | C | G | 0.734073333 | 0.1104 | 0.878 |
| scaffold5190:41454 | 237 | 238 | G | A | 0.734073333 | 0.1104 | 0.878 |
| scaffold49917:1105 | 239 | 240 | A | G | 0.726261988 | 0.075 | 0.8452 |
| scaffold4775:75747 | 241 | 242 | T | A | 0.719005761 | 0 | 0.6429 |
| scaffold26152:15432 | 243 | 244 | G | A | 0.712320041 | 0.04667 | 0.7794 |
| scaffold5841:135056 | 245 | 246 | G | A | 0.712144097 | 0 | 0.6585 |
| scaffold5876:106401 | 247 | 248 | A | G | 0.711616024 | 0 | 0.6548 |
| scaffold5113:138895 | 249 | 250 | A | G | 0.711038199 | 0 | 0.6111 |
| scaffold5113:239777 | 251 | 252 | G | A | 0.710611128 | 0 | 0.6463 |
| scaffold5841:135067 | 253 | 254 | G | A | 0.70853417 | 0 | 0.6585 |
| scaffold25099:7321 | 255 | 256 | A | G | 0.702374753 | 0.04375 | 0.7439 |
| scaffold14172:40875 | 257 | 258 | A | C | 0.702287484 | 0 | 0.5952 |
| scaffold158332:502 | 259 | 260 | A | G | 0.702084439 | 0 | 0.6548 |
| scaffold16869:149709 | 261 | 262 | T | G | 0.700892589 | 0.1456 | 0.9 |
| scaffold114539:1119 | 263 | 264 | C | T | 0.695979272 | 0 | 0.6111 |
| scaffold3842:272682 | 265 | 266 | T | G | 0.69126068 | 0 | 0.6207 |
| scaffold60331:8510 | 267 | 268 | G | C | 0.690556532 | 0.02083 | 0.7083 |
| scaffold2360:22109 | 269 | 270 | G | A | 0.689941474 | 0.05797 | 0.7857 |
| scaffold72613:1728 | 271 | 272 | G | A | 0.689632536 | 0.04375 | 0.7262 |
| C32052717:309 | 273 | 274 | A | T | 0.688864374 | 0.1437 | 0.8929 |
| scaffold71943:13435 | 275 | 276 | C | T | 0.685008651 | 0.01875 | 0.6667 |
| scaffold71943:13471 | 277 | 278 | T | C | 0.685008651 | 0.01875 | 0.6667 |
| scaffold128544:170 | 279 | 280 | G | T | 0.679301025 | 0.01266 | 0.6316 |
| scaffold98263:1069 | 281 | 282 | T | C | 0.677597364 | 0.06875 | 0.7738 |
| C32058675:292 | 283 | 284 | A | G | 0.673619414 | 0.00625 | 0.6548 |
| scaffold823:11824 | 285 | 286 | C | T | 0.673537258 | 0 | 0.6286 |
| scaffold75287:5899 | 287 | 288 | G | T | 0.673496392 | 0.04375 | 0.7073 |
| scaffold16869:149730 | 289 | 290 | C | T | 0.671387324 | 0.1456 | 0.8718 |
| scaffold3842:543149 | 291 | 292 | C | T | 0.670719701 | 0.025 | 0.6406 |
| C32058613:508 | 293 | 294 | T | C | 0.67046277 | 0 | 0.5758 |
| C32058613:563 | 295 | 296 | C | T | 0.67046277 | 0 | 0.5758 |
| scaffold7146:70340 | 297 | 298 | C | T | 0.669199626 | 0.1688 | 0.8857 |
| scaffold23125:41276 | 299 | 300 | G | A | 0.669066022 | 0.02597 | 0.675 |
| scaffold2418:77508 | 301 | 302 | A | C | 0.66841944 | 0 | 0.5758 |
| scaffold12000:86305 | 303 | 304 | T | G | 0.668293303 | 0.06494 | 0.75 |
| scaffold24181:60784 | 305 | 306 | T | G | 0.665305819 | 0.02985 | 0.7083 |
| scaffold26621:72993 | 307 | 308 | G | C | 0.66089327 | 0.01493 | 0.6389 |
| scaffold6391:16360 | 309 | 310 | T | C | 0.658649381 | 0 | 0.5952 |
| scaffold88759:12655 | 311 | 312 | A | T | 0.656740474 | 0.1824 | 0.9091 |
| scaffold37469:74336 | 313 | 314 | C | T | 0.65627789 | 0.2062 | 0.9405 |
| scaffold12000:86310 | 315 | 316 | T | C | 0.655527354 | 0.06494 | 0.7375 |
| scaffold6143:103796 | 317 | 318 | G | A | 0.651968566 | 0.00625 | 0.5976 |
| scaffold33135:78155 | 319 | 320 | A | G | 0.648604326 | 0.1 | 0.7976 |
| C32058675:317 | 321 | 322 | G | A | 0.644770212 | 0.025 | 0.6667 |
| scaffold1976:5193 | 323 | 324 | G | A | 0.644158158 | 0.01333 | 0.569 |
| C32098343:2061 | 325 | 326 | T | G | 0.643381806 | 0.225 | 0.9405 |
| scaffold158089:295 | 327 | 328 | A | G | 0.642783505 | 0.01316 | 0.6053 |
| scaffold17267:6243 | 329 | 330 | T | C | 0.641890359 | 0.7372 | 0 |
| scaffold491:22100 | 331 | 332 | A | G | 0.640380595 | 0 | 0.575 |
| scaffold121522:9070 | 333 | 334 | C | A | 0.634641583 | 0.1899 | 0.9167 |
| scaffold24615:2202 | 335 | 336 | G | C | 0.63300106 | 0.0125 | 0.5952 |
| C32052323:699 | 337 | 338 | A | G | 0.62920484 | 0.00625 | 0.5789 |
| C32052323:711 | 339 | 340 | C | T | 0.62920484 | 0.00625 | 0.5789 |
| scaffold14925:8868 | 341 | 342 | G | T | 0.626191251 | 0 | 0.5714 |
| scaffold133681:2742 | 343 | 344 | A | T | 0.624116063 | 0.03125 | 0.631 |
| scaffold9639:84033 | 345 | 346 | G | C | 0.623460076 | 0 | 0.5488 |
| scaffold61482:2893 | 347 | 348 | T | C | 0.623250917 | 0 | 0.5714 |
| scaffold30395:12115 | 349 | 350 | C | T | 0.622098398 | 0.2562 | 0.95 |
| C32064647:1071 | 351 | 352 | T | C | 0.621516382 | 0.06757 | 0.6935 |
| scaffold2452:1249 | 353 | 354 | C | T | 0.62106722 | 0.1709 | 0.869 |
| scaffold11436:6161 | 355 | 356 | G | A | 0.619484202 | 0.02597 | 0.5833 |
| scaffold4156:16965 | 357 | 358 | G | A | 0.61693912 | 0 | 0.5476 |
| scaffold43435:8325 | 359 | 360 | T | C | 0.616228929 | 0 | 0.4783 |
| scaffold11297:60144 | 361 | 362 | T | A | 0.615278463 | 0.7063 | 0.0125 |
| scaffold51841:4904 | 363 | 364 | A | C | 0.612402044 | 0.02083 | 0.5811 |
| scaffold38015:40961 | 365 | 366 | A | G | 0.611084671 | 0.01899 | 0.5789 |
| scaffold16206:63417 | 367 | 368 | A | G | 0.607275173 | 0 | 0.5366 |
| scaffold13781:319 | 369 | 370 | T | C | 0.603383314 | 0.02564 | 0.6143 |
| scaffold27023:20610 | 371 | 372 | T | C | 0.603215502 | 0.025 | 0.5952 |
| scaffold4618:83522 | 373 | 374 | T | C | 0.600515019 | 0.03289 | 0.6094 |
| scaffold111383:2928 | 375 | 376 | G | A | 0.598738059 | 0.0443 | 0.6341 |
| scaffold29335:25795 | 377 | 378 | C | T | 0.598716088 | 0.1447 | 0.8 |
| scaffold122455:2010 | 379 | 380 | C | G | 0.597062664 | 0 | 0.5238 |
| scaffold13362:101120 | 381 | 382 | A | C | 0.595795194 | 0 | 0.5238 |
| scaffold38557:23764 | 383 | 384 | A | T | 0.595178777 | 0.01923 | 0.5 |
| scaffold38557:23794 | 385 | 386 | G | A | 0.595178777 | 0.01923 | 0.5 |
| scaffold50091:2544 | 387 | 388 | G | A | 0.595100084 | 0.006494 | 0.5 |
| scaffold16614:72706 | 389 | 390 | A | T | 0.594825378 | 0 | 0.4483 |
| scaffold4877:4542 | 391 | 392 | A | G | 0.593079588 | 0 | 0.5 |
| scaffold65894:5390 | 393 | 394 | A | T | 0.590807732 | 0 | 0.4857 |
| scaffold90107:10791 | 395 | 396 | T | C | 0.590194744 | 0 | 0.5122 |
| scaffold68873:1704 | 397 | 398 | A | T | 0.586079433 | 0 | 0.4875 |
| scaffold3842:188176 | 399 | 400 | C | T | 0.584333199 | 0 | 0.5161 |

TABLE 6 Upstream, Allele and Downstream sequences for SNPs from Table 4SEQ SEQ SNP Name ID Upstream Sequence Minor Major ID Downstream Sequencescaffold 1 CAGATCCTAAATATGCTGATATTATTCTTTTA C T 2ATAAGGAGTTATTATTATTTGTTTTGGGTGCATTGCTG 14566:24841GAGAATTATGCAGCATTTCAGAATAGGTACAT AGTAGAAGCAGTTTCATGCAGCTTGTATGACCTAGCCAATTTCATTCTTTTATTTTTTCTCCTGTATCTT ATGTTGTGCCTACCCTAGCCAAGT TCAT scaffold3 CAATAAGACTTTCGATCGCTCTTGGTGCTGCA A C 4AATGGGGTATTGAGGTTTGAAGAATTCTTTCATATGAA 2257:59436CGAGGCAAGATTTCTATATTTGATATTGGCTT TCTGCAGGATTAACTTATCTTCACACATATGCCGGCTGGTCCCATTTGACAATTTTCCATTAAAATGGCT CGTTATACATAGGGATGTGAAATC GATG scaffold5 TCCTCGATTGTTGAAGGAATTGTGGGGAGGCC A C 6AGCAAACTGCAAGTTCTTGACAGTGTACTGAGGCAACA 123303:7086TGCTTTGCGCTGCTGCTGGGCCCATGATAGTT CAAAGTGCAATTGTCAATTATGATCTGGAAGAGAAAGGTTTGCTGTAGTTTCACATTTTCTGGTTCAACT TTCGCAGCAAAAGACGCAACTAGA GTGA scaffold7 CTGAAGCCAGGACAATGCAGCCAGCCAGTAGT C T 8GGGCAAAGTTTGACGAGCCAGTCTAGCCAATTCGGAAA 21832:18317GAGTATCATCATCAGTATGTTGTTGCTGTCAA TAACTCAGATTATAGTCCAAGTTTTTGTAGAGGCTCTCTGGCATGATGGACCCGAGTCCACGGAAGAGTT CTACAGCAGCATATGCAATGGAGA CAAG scaffold9 CCATCATGAAATTCCCTTACCCTGGTGCCTTA C T 10GACCGACTTGAGCTCCTCACAATGTGGCGCTTTCTACC 10653:22776ACTGCTCTGCAGTATTTCACTAGTGCTTTTGG TGCTGCTGTTATATTCTATCTTTCCCTTTTCACCAATAGGTCTTCATTTGTGGATGGCTTAAGTTCATTG GTGAGTTGCTCCTTCATGCCAATG AGCA scaffold11 CCCATGGAAACAAAAGTTCCCTGATATCAAAC C A 12ACAAGTACTGACCTCAGGAGTTCCAAGAACCCAGTCAC 34968:5203AGTGAAAAACTTTAGCACAACACAGCAGCATA GGATGTGATCACAAGCAGAGCTGGCAGCAGATAAAGCAGATATCGATAAGGAAACAGATTAATTGAAAAG CTTGATAGCTTACGAGCTTTGATA AAAA scaffold13 AACCATTCATCTTCTTATCACTGCTGCCATTT T C 14GTATAGGCCACACAAACCAAAAGATTAGCCTTGTTGTT 41828:12391CTTGTTTGAATGTCATAAATTTTGGATTGGCA TTTGGCAGTAGCTTTCATGACTTTCTCAGCAGCAACTCGGATTCTTGAATAGCATGAACGATCTCATCAC TCACAGGCTCATTCAACAGTTTCA ACGAC32084869: 15 AATGGTTCCTCAATCTTCAAATCATCAGAAGC A T 16CCCAAGGCCAAGTTCCACCTGTTTCATCAGGCCATGCA 1171TCCTTGTTAGCTCGACTCTTCCTTCCACGAAG TTATCAACTTCCCAACAAGTTCCAGTGGCAGTTATGGGCAGCTCCATCAAATGCCTTCTCATTCAGATAA TCCCAATCACCAGCAGCTGCAGTC TAGC 
scaffold17 CGTACCGTTCATTGTCTTTAGCTAACGACGCC C T 18AGGCATAGGATTGGCTCATTGGCAGCAATTGGGGGTAA 1342:67015AATGAAGCAGCAGCATCCGATCGTTCTTCCAA GCCGAGGTACTCGTCGTCTCGATCGCAGGCAGAGGCAGTGACCCCGTGTAAAGAATTGCTATCTGTTCCC AAACCCGAAGGAGCCAAGAGACAT ATAT scaffold19 GCTCCGCTGCTGAAGATGTTTCCAAACCTGCA C T 20TGCTGCTGAGTTTCTTTTTCATTAGTTGAATTAATGTA 5876:136612CACCAAGGAGTCTCTCGTATCCCCTGATGAGC TCTACAAACTATTTGTTACTGCTAAAAGCTTATTGTCTTTGGTCCTTTTCAGGTAAAACTATCTTGCTCA TTAACTTATTATCACATCTATACT TCAT scaffold21 TCTCTATAATCAACAAATGGGCCATCATGAAA G A 22TGGCTTAAGTTCATTGAGCATGACCGACTTGAGCTCCT 10653:22755TTCCCTTACCCTGGTGCCTTAACTGCTCTGCA CACAATGTGGCGCTTTCTACCTGCTGCTGTTATATTCTGTATTTCACTAGTGCTTTTGGGGTCTTCATTT ATCTTTCCCTTTTCACCAATAGTG GTGG scaffold23 CGCGAAATAACATTCTCACCTTCTCCTCTCTC C A 24ATCGAACACGGAATTCGCAAATACGGTGATTCGATCAA 65043:4386GCCACCGCTGCTGCCGCTGTCGCATACGTTTT TAACGTTTTCGTTCACATTAAGCAAACCGGTGTTGCTGTCTCTCCTCCGATGATGATCGTGATCATAAGG CTTCGGTTCTCTGGCAATCGCTTA CCGC scaffold25 AATAAAACATGAGCTTCTTGATATTCTCACTA T C 26GATCGCTCCAGGTAAAGCCTGCCTCTGTATGCAGCCAA 39548:5676AGTTTGGGTAGTGGCATAGTCTTTGGAGAAGC GTGAGCATAATAAACTGGGGGAACCAAAGAAACTGGTTAGCTCGAGAAAGTGAGGAAATAGTACTTTTGG TAGTACACCTTACGTATGTATAAC AGTC scaffold27 CACTAAGTTTGGGTAGTGGCATAGTCTTTGGA A G 28TATGCAGCCAAGTGAGCATAATAAACTGGGGGAACCAA 39548:5703GAAGCAGCTCGAGAAAGTGAGGAAATAGTACT AGAAACTGGTTTAGTACACCTTACGTATGTATAACAAATTTGGAGTCCGATCGCTCCAGGTAAAGCCTGC GATTGTACACCAGTTTCTGTAGTT CTCT scaffold29 TACCATTTTCTTGCCCACAGGTTATTTTGCTG G A 30TTCCAGGATATTCCTTTGTGCCATTTAGTAAAGTGAGT 17605:9224CTATCCTTGCGTGGTTGTGGCCCACTCTGGCC CTAAAAGCAATGCTTATCTTTAAATATGGTATCTGGTGAAATCGGTTATCCTCATCAACAGTGCTGGAAA CAGCTGCTGACAACTGAAAGCTGT TGTC scaffold31 TCAAGAGTTGGTTAGAAATTTATAACATACCA C T 32GTTGATGAAGTTTTTCGGCTCACATCTTTACCTGGCTG 94301:10558GAAGCAGCAGAAAATGCTGTACCGCTTCTTTT AGAGCCAGTTGCAGCATCTGTTCCTCCTTCACGTTTTTCTCAGCATCATGATGGTCTTTTGTAGTTTTTC GTCGAGCTTGTTCAGCCATAAGCT CAGG scaffold33 CTAGCAGCTTCAAAATGGGTTTCTTTCTTTGG G A 34CCACCAACCAATTCAGCAGCCATATTGATCAATGACAA 52608:15151AACAGCTGGGATATGTCCTCCGAGGCCAATCT 
AAGTAACAAGTTCTTTGTTGAATCTGTGTAGGGGTTACGTCAAACAGAACTTGAAGTGAAGCAGAGAGCA GGATGAGAATGAAGAAAGAAAGAA AAGC scaffold35 CCCAAAGAGTTTGGTTATTTCCTAGTGTTACG G A 36GTTTGTGAAGTGGGACCACTATTTCCTGCAGATGAAAC 9110:12755ACTGTTGGAAAGGCCCTGTCTATTGTGGTGGC AACCGTGATCCCCTTTGATGCTGCATGGAAGGACCCAATGCCACAGTTATGAGCCATGGTGCAGTGTTTG TTGCAATTGTGTCACGCAGATCAA AAATC32099389: 37 TTGATCTACTGTTGCAGAACATGTATGCTCAG G A 38CCACCTCCTATGGGGGGAATGGGAATGCCTCCAATGCC 1875TTGCTGCCTCTTGCTTTGCCTGCCCCACCTAT ACCCTATGGCATGCCACCCATGGGGAGCAGCTATTGAGGCCGGGAATGGGAGGACCAGGAATGGGAGGCT ATGTATCAGGACTACGAGTTGTGA ATGCC32076905: 39 CTGACAATATAGCAATGCTTCATTAAATCTCA G T 40CAAAATTCAAACCACAAGATCACTCTAACAAGTCACAA 769TTATTACAAATACACTCATAAGCTCATCAATA CAAACACATAAAAGCATCAACTCTAGCAGCACCACCACTAGATAATAGATATAGTAAGATACAAGCTGCA CACCAAACCAAGGCATCATCCCGA ATAC scaffold41 GACTAAGCCTCAACTGGGTTTGGAAGAATTTT T C 42TTAGGGAACCCAGTGACCTCGAATTCTTCAGACTCTGT 50412:1217GACCATGAATCTGAACCGTCGGCTGCTCTCAA TTTCCGGTGAGGGAGTTCGTTCCAGATAGAAATGAAGGAAAGTAATGTAACTGAGAACGATGGAACCGGC CTGCAGAGCCGTAAGCGCCAGAAG ACCG scaffold43 ACAAGAAGAATAGAAAGGCTGCTCTTACTATT T C 44ACTTACTGCAAGCTGCTTGATGAGGTGAACAAAGAGAG 73281:3231TTCTATGCCTTGGCCTTGGCTGAAGCTCTGTT TGAATTGGGAGGAGCTTCTGGTATGGTTTCAATCAGAAGTTTCTAATGGAGAAAGCTTATTGGGAATGGA GGTTCTTCTATGATGCCTATTCGA AGGT scaffold45 CCAATTCCGCCACTGCTCCACTGCTGCTTGCT T C 46GTTTCAGCATCATTACCATTTAATTTATCCTCTGGTAC 60591:9229AGATTCTTCTCAGATGAGCGCTCAATCTCTTT TGGGCTGCTGTCTGGACCCCATTCTGATGAGTCTTTAATGACAAAGATCCACCAAGCTCAGGCTTATCAT CATCTGCTTCAACTTCATGCGAGG CACC scaffold47 GTACTGGGCTGCTGTCTGGACCCCATTCTGAT G T 48GTTTCTAAACTAGCTGCAGTAGCATCCATTTGATCCAC 60591:9364GAGTCTTTAACATCTGCTTCAACTTCATGCGA AGTTTCCAAGGAAGATCCAGATCTATTTATAATTCCACGGTAGATGTAGATACCACAGATTCTGACATAG CAGTGAGGGGTGCATCCACATCAT TTTC scaffold49 TCCATTGCAAAAACTACCACCATGGCTCTGAT T A 50GATCAGAAGATTTATGAGTTGCTTCTTCAGAAAGTTGG 27976:11462CAGCAGCAGCAAAAGAGCTAGGCTTCACTTCC TAACCTTGTTCCGAGTGTTGACTTGAGGTATTTGCTGCTCCACAAGATCAAACTCTTTCTTCTCTTGCTT AAATGGAGACTTCTCTCGCTGGCT GGTT scaffold51 
AAGCAGCATGGCGGCGATACAGTAAGAAAAAG A G 52CCTAGTTTAGGTGCCACCATTTATGCTTCACGCTTTGC 4591:18700CTTGAAGAGTCTCTTAGGGAAGAGGAGAATAG TGCAAACGCACTGCGTGCCATAAGGCGTAATAGTACACGTTGCAGGATGCGTTGGCCAAGGCCGGGGCGA GTAAAACAAGGATGCCCGAAAGAA GCTC scaffold53 AGGAGGCAACGGACTCTTCAACTTAGCAGCAG A G 54AAGATTTCTGCTTCCGCGGATGTCGGATCTTGGCGGTG 96873:16326CGGAAGGCGAGGGTTTAGGAATCGGAGGCAAG GTGCTGCTAGGGTTTGGAGATGGCGGAGGTTCCGTCTCGAGTTTGGATCGGCATAGGGCGGAGCATTCTC TCTCAAGATGTTATTTCTTGGCTG CTTG scaffold55 ACCATTATGCGGTGTGTGCAAATTCGAATTGC C T 56CTTCTTCATCTTCAGAACTCTCATCATCATCCTTGAAA 6777:12771TCTCCCCATCTGAGTAGTCTGATATCAAGTGA ATTGACCCTGGTGTTAATCTTTTGTTTTGTTTTGAGCTTCTGATGGACTTCTTGTGCTGCTATCTGAACC GTTCAAGATTTCTCCATTGTACTT CACCC31894837: 57 ACTTCGGCTCTAGGCAGCCCTGCGCTTCGCAT A G 58GGGCTGCAAGAGTAGAAGTCTTCAGCGATTCACAGTTA 130TCCTAGCCTCAAACAACGAAGCCGAGTATGAA GTGGTCAATCAAGTGTCGGGGGAATACCAAACGCGTGGGCCCTAATTGCAGGACTAAAGTTATCAAAGGC AGAAAAAATAGCCGCTTACGTAGC CGTA scaffold59 TGTGTTCAACGCTTTCCGGGTCACGCCCCAGA C T 60GCCACCCCACCTTCCTCAATAATAAGCTTACCGTACCG 1342:66883CGCCCAATTGCCGTGGCAGCATTCTCCTGACC TTCATTGTCTTTAGCTAACGACGCCAATGAAGCAGCAGTTCCATTCGGCCTTCTTTTGCCAATTTCAGAA CATCCGATCGTTCTTCCAATGACC GTGGC32035477: 61 GAGAAGAGAAATTATCATATGCCACCATGAGC G A 62GCAAACAGTGCTTGTTTCTCCCATTCGTTTCGAACATT 264AAAACCGCAGCCTCCTTTGCGAGGATAGAATT TAGCATGGTATTATCACTAGCAGCTTTTTTGGAAAAACTTCTACCGAGAGAGTTTTGTCGGTCGCTTTCT TAAGGAAATACTTCAAGATGGCGA CCAT scaffold63 ACTTGGCAATGTTAGCAGCCTGGTCCAGAGAA T C 64TCAAGGTCAATGGACTCCAGAGCAGCCTTAACCATGGC 6360:1717CTAGAGAGCAAGGGATGGAAGCATATATGGTC TAGGACTGAGGTTCTCAAGGGGCATGGATCCGAAGATGCACAATACCAAGCCGGGGATGACTGCCACTGT GTTGTGGATCCAACTTGGAAACAA GCAA scaffold65 GTAAATGCAGTCCTTGTGATTGTTGTTGTTGT T C 66CGACCAAATTGCTGTAAAAAGAAGTCATGCAACAGAAG 132623:4144AAATGCCCGTCGTGTTTCAGTTGCTCGTTACC TTGTTGCTGCCTATTCCCTACTTCTTGTTGTTCATGCCAAAATGGCGTTGCCCTCGTTGTTCATGTTCGT CTGATTGCCCCTCTTGCAGATGCA GTCC scaffold67 ATTGCTCTAGGGGCAGCCCGAGGCTTAGCTTA T A 68TGACTTTACACCTAAGGTCTCAGACTTTGGATTGGCCA 72006:11851TCTACATGAAGACTCCAATCCACGAGTGATTC 
GAGCAGCATTAGACGGTAACAGACATATCTCAACACATATCGAGATTTCAAGTCCAGCAACATCCTGTTA GTTATGGGCACTTTTGGGTAAGGA GAGC scaffold69 AAAAAGTTGCAGCGGCAACAAAAGGTCTGTGT C T 70GGTGGGTTGTTGGAGCCAATTGAGTATATTTCAACGGA 6742:34122TGGGATTGAGAGGCTTTGAGAGAAGAGACCCT GGAAGTGTCGGAATCGGAGTCGAAAAAGGCGATGGCTGCGAGGGAGAAACCCACGAAGACTGAGGAGCTA CGAACCGGTCCAAGGCCGTGACTG GGTT scaffold71 AGATAAAAGTCGCTCATTCTGAACCTCTTCCT G A 72ATTCTCTCTTAGCTTGTGATGCTTTTACTTCTTCCTCT 3108:8782CTAACGAAGCTGCTTTTAAGAGCAAGGTGTTG GCAGCTTTCTTTGATTCAGTAGCAAGAACCATCTTTATACCTCATTTTCAACTACTCTCAATCGAGACTC TTGCAGCTCCTGCAAATCCATTGC TGCC scaffold73 AAGCAAATTAACAACCTGCTTATCAACCAAAC A G 74AGGCTCCACTAGCAATTGATCTTCTTGCACCCTTCCTC 11225:9528CCGGATGAGTAGTGACTACATAAGGGTTATTA ACTAAGTCTTGAGCAGCCCATATACTCCGCCAAATAAATCATGGGGCAGCCACGGGTCATTTAGAATGCT ACTAGGATTATTTCCTAACTCAGC AACC scaffold75 CAACCTGCTTATCAACCAAACCCGGATGAGTA T A 76GCAATTGATCTTCTTGCACCCTTCCTCACTAAGTCTTG 11225:9539GTGACTACATAAGGGTTATTATCATGGGGCAG AGCAGCCCATATACTCCGCCAAATAAAACTAGGATTATCCACGGGTCATTTAGAATGCTAACCGAGGCTC TTCCTAACTCAGCATCAAGAAAGG CACT scaffold77 TGCCGGATTTAGTGTGTAATAAAGACGAATCG A G 78TTGAGGCTTAAACTCATCACTGAGCCTGCTGAGACTGC 2579:59858AGCAGCTTTGGGTTCAAGAATTTGATGGAGAC TTCGTCGTCTCAGCTTCAAAATATGGCCATTAGAGTTGTTTCTGGGTTGACGTTCAAGAAGCTGAAGAGA AATTGAGGAATGGCTGCGTCGGAT GACC scaffold79 GCACTTCCATTTACTTGGCCTCAGGCAAAATA T C 80GTATCCCCTGATGAGCTTGGTCCTTTTCAGGTAAAACT 5876:136559TACCAGGAAGAGAAACATCTAGCTCCGCTGCT ATCTTGCTCATCATCTGCTGCTGAGTTTCTTTTTCATTGAAGATGTTTCCAAACCTGCACACCAAGGAGT AGTTGAATTAATGTATCTACAAAC CTCT scaffold81 CATCTAGCTCCGCTGCTGAAGATGTTTCCAAA T C 82ATCATCTGCTGCTGAGTTTCTTTTTCATTAGTTGAATT 5876:136606CCTGCACACCAAGGAGTCTCTCGTATCCCCTG AATGTATCTACAAACTATTTGTTACTGCTAAAAGCTTAATGAGCTTGGTCCTTTTCAGGTAAAACTATCT TTGTCTTTAACTTATTATCACATC TGCT scaffold83 CTAGTTTTAAGGATGCACAGTGGCTCCTCATC A G 84ATAAATGACTTCAATGAGTACTATCTTAGTGATGGTGT 23386:12397CAGAAAGCTGAAGAGGAAAAGCTTCTTCAAGT TAGTAAGTCTGCTCAACTGTGGAATGAGCAGCGAAAGTGACTATTAAGCTGCCAGAGGAGAAACTAAATA TGATATTGCAGGATGCTCTTTTTA AGTT scaffold85 
TGCAGCCCATTGTTGAGATGAATGAGATGTTA A G 86AAGCAGCAACTTCTGTGATATCCAGTGGGTTTACTAAA 93032:4944CATGCTTGAAGTTATGTTGATGTCTCATTTCT ATAGCGCCAGCACCGAGTGAATTTACTGCACCTGCACACTTTCCTTAGCTATCATAGTCAAAGCATCACC CTAAATTTAAATGGAATAAACAAG AATG scaffold87 TGTTAACAGCATCAGCAGCAGCAGCAGTATTA A G 88CCAATATGGTGGAGCTGCAACATGGGGATTAGTCAGAC 94004:13663TCCCTATTTCGTAAGGAACAAGGACGCCACAT ATAGCCAATTCCTTAATCCACCTCAATGAAAAAGAAAAATATGTTACAGGAGCGATCGGCTTAGAAACTG GGACAGTAACTAACCATTGCATTA GTGC scaffold89 ACGACGCCTACTACCAATAACCCAAATAGGGC C A 90TACTACTACTACTACTACTACTACTACTGCTGCCTTCT 26621:73038AGCAAGGATGATGATGATGACCACACAATCCT TCAGGGGTAGCACCATGGTGGATAAGCTAATCAAGGCCGCTCCCCAACACTCTCCAGACCACCCAACACA AGAAGATCATCATCAACAACACCT TCCA scaffold91 CCCAAATAGGGCAGCAAGGATGATGATGATGA G A 92CTACTACTGCTGCCTTCTTCAGGGGTAGCACCATGGTG 26621:73058CCACACAATCCTGCTCCCCAACACTCTCCAGA GATAAGCTAATCAAGGCCAGAAGATCATCATCAACAACCCACCCAACACATCCAATACTACTACTACTAC ACCTACCTACATATTATAACACAA TACT scaffold93 ATGATGATTTAAAATGCATTAACTTCACATCC C A 94CTCGGAAAAGGCTGCAAAAATGATAATCATAACTAGCA 62259:15201CAATATTTCTCACCAACCCATAGACTCATAAC TAAAATATTGATCTTAATTCATAACTAAACCATATATTTTTAGCCTTTCCCTCTAAATATTGGATTGCAA TATAGAATCAAAACTTTTCAAACA ATTT scaffold95 CAAAACCATCGCCGGAGAGAGAGAGGTTGTTT A G 96AGGCAATCATGGAAGAGGAGGCGGAGAGTGGCGGCAGC 38801:13810TCAGCATCTCGCTCAGCCTTGTTGAAGGCATT TGTGGTTGGGACCACCGATTGCTTTTCCTGTACAATCTGGAGGCGATGAGGATGGAGGCGTCGCAGCCTT GCCGTACGATGTCGGGGAATTTCG CGAC scaffold97 TCGCCTTACAGGCTTTACTTGCACAGTCAAAG C T 98CTGATTGTTGAGAAGTTTTTGTACTCGTGAAATAGAAG 6550:117010ACCGGTCTCAGCTCTTTGCATCGTGGGCAGAT GAGAGGCCGCTGTGGCTTTTGCTGCTCTTTCGGGTTATTTCGATGTCTATCGTTGGCTGGTCTTTGTAGT GTTGAAGACACGAAGAGAGTGTTT AATG scaffold99 TTGGGTGGTGGCTATGGCTGCAGGCACTAAGA T C 100TTCTCCAATCTGAACGCCACTCTCTCCGATCTCGGGGC 94863:28465GCAAGCCCACAAACCAAGCTGTTGAAAGAAGC TCAGATTCGAGAGGGAGATAAGTGGTTTGCAACGGCGCATGCAGCCCGTATAATGCCACAGACTTTGCTG AGCAGGCACGGGGATCAGACCCTG ATTT scaffold101 GAAAACGAATGCTGCAAATCGATCGGTGAGCC T A 102ATAACAAATCTTGGAGCTGCATTGTGATGATGATGATG 130551:553TGTTATGCAACTCTTCCTTGTTTATTTGATGA 
ATGATGATGATGATTGATAGTCATGGTGTTATGTTTTTTCTTTTTAGTCTCTTTTCATACATTATTGTAA CACAATAATAAAAAAAAGAAAGTC TTGT scaffold103 AAAGCCTGCTGCGCTTCCCGGGCTCTTGAGGA C A 104ATAGTTACCATAGATTTTTCTGTCAATACTTTTAAATC 21832:579GATTCTTCACTTACAATTTGCTTAGCTAATTC GATACCCGGATAGCATTCGCAGCATTCTAAAATCTTTTTTGTAAGAGATCATGCATCCATATTTTGTCAC TCACCCGAATTTCGCTCTCTCCCT CAACC32090201: 105 GATTATACCCAGAAACCAAAGCAGCCTTAGGA C T 106CTTAACCAAGCAGCCAGAACAATTTTGGTGTGAACATC 1470CATTCCAGATTTCGTTTACAACAATCCATAGC AACAGCATGCTGCCTTGCTGACCGGAGACTCCGCCGGACGAAGTACCCAAAAGCTCATCTTCCCTTCTCT AAAATTTGGGATCCGAGAGTCCCT CAAA scaffold107 GATTCGAGTCATAACTTGGCACAACTAAACTA C A 108GAGTAAACACTTTTCCTCTGTTCATCAGACTGACTCTC 117639:1939CAGTCCTGACAGTCGGATCCACAATCTGTGAA CTGCAGCATTAAATTACTCAATTCTGGATGATGCAAAAAAAAGCAGCAATATCTCGCCCAGATATAAAAA TCAACTTGAAACAGACTAAGTTAT ACCG scaffold109 AGTCCAGTTACAACAATGTTTGAGCCTCAAAA A G 110AGCACCAGAAAAGAAGCTTACCCTCTTTGCTCTTAGGC 73281:2531TAGTATAGAGAAAAGGGACAATAGTAGCAATG TTGCTGTTCTTGAAAAAGCAGCAACTGGCCTTGGAACACAAATTCTGGCTCTGCAGCCTTAACAAAAACA CTTGGTTTTATCTGGGCAACAGTT GTAA scaffold111 GTTTTAGATACCAAAAATAAAAGATGTAGATG T C 112ATCTCTTTTTTGGCTAATGCAGCTATAAAGCATGGGGT 95390:2586ACCGGAGGAAACAAGCAGCTGGGTCTGCCAAT TCTCTTCTGGGACAACAGTTATCAGAAAGAGACATTATTCAAAACAGTCTCAGTTGCATAATTACTTGTC TGTGGCGTGCATAGATTACAGGTT TTAA scaffold113 CTAACTCAATATGTTTCATTCTTGCATGATAA C T 114TGTTGTAGAAAGAGCAGCAGCGAACATAAGTTTTAAGT 109105:3417ACAGAACTTGCAGCTGGTGCACTAGCACTCAT TCAAGTAATTTGTTAGTCAAATTTCAAAGTGACACTTCGTTGTCACACCATGTTACTGATGTGCATAATT CTTTATTGGAAAGGAAAAGGTACA GTGT scaffold115 CCCATGGCTTGGACGAGCTCCGACCTGTGATC T C 116AGGTGCCATTTCCGTCACTCCCTCACTGAGCAGGAGAA 9670:13701AGACATCCTCGTTACAAGACTGAGATTTGCAG GTTGATGCTGCCTCATTGATCACCATGGAAGTGAAATTAATGGTTCTGGCTGGCGATCCCTGCCCTTACG ATACCCTTTAGCAGGCAGATTAAG GCCA scaffold117 GTTGTATTTGCTGCCACGATAAAATTGCTTGG C T 118GTGGCTCAACCTTTGCTCTATATTCAGTATTTTGGTTA 45478:20587ACATATCTAAAGTAGACTCTGACTTGCTCCCT CAGCTGCAGGAATGGTAGTCATATCTCTCAACTGTTCTGATTTTGTCTCCCGGCCAACTTTGCTTGCAGA TCCATTGCTGGAGCAGTGAAACTT AACA scaffold119 
TCTTTGTCTTTAATCATATCAAAGACTTCTTG C A 120AAGACCAATCTTAATTGCAAAAGCATGGACTTCCAGCC 23700:29096AGCAGCTTCCAATTGCCCACACTTGGAATACA CTTTGTTGAGTGATCTTAGAGATGCACAAGCTGAAGCTTGTCAACAAGTGAATTTCCAACTAAAACATTA GCACTTGCTATAGTAACTGCATTC TCTA scaffold121 GACATAATGAGTAGACAATGAAGGTATTAAGA A G 122ACTCTCACATAGTGTTGCAATGATGGATTCGTAGCTGC 10732:30120TGTGGGCCTATTCAATAACAACGGAAGCAAAA AAGGGAATAGAGATAAAACAAAATTTTTAAAGGAGATTTGAGTACCTTAGCTTCAGGTTCATCTAAGGTG TAATAATAATAATAATAATAATAA TCTA scaffold123 AATTACAGGAACAAGGGCTAAAGCCAGATCAC T C 124GAAATTATGAGAAAGGAATTCAATGTAATGCCAGGGCT 16969:31933ATCACCTTTTTGGGTGTTTTAGCTGCCTGTAC GCAGCACTATGCTTGCTCGGTAAGTCTGTTAGGTCGCGTCATGGAGGTCTTGTTGATGAAGGAAGAAAAT CAGGCTTGCTTGACGAGGCATTGA ATTT scaffold125 TACCCCATCATTATATTGATTTTGATGATGAG A G 126TATCGAAAAGCTTGGCTTTAAGTGCTCCTAGTAAGCCC 3884:40695GTAGTGGAACAGGGAAGAGTATGGCAGCGGCT TCTGAGGCTGCAGAGCTAGTAGGAGCACACACTTGCTCGCGGCAGCCTTGGTGGTGGGCTTGGCATCGCC GAAACAAAAGGGCTCCAAGTTATT ATTG scaffold127 GGTGGCCTTAATGCCTGGTTCGGGAACAGGCT A G 128CTGCATCCCACCAACATCCCGGTGGTTGTGGAGATGAT 829:52127GCAGCAATGACAGCCATAGCCACTTGCTCAAC ACTAGCAGTACTACTAGTCATCGGTGACTGCAGCTACTGCTTTTAGGCCGATTATTAATGAGTCTCCGGT ATATATATATATATATATATATAT GACT scaffold129 CCATTTTCGCCTTACAGGCTTTACTTGCACAG G A 130TAATGTCTGATTGTTGAGAAGTTTTTGTACTCGTGAAA 6550:117004TCAAAGACCGGTCTCAGCTCTTTGCATCGTGG TAGAAGGAGAGGCCGCTGTGGCTTTTGCTGCTCTTTCGGCAGATTTCGATGTCTATCGTTGGCTGGTCTT GGTTATGTTGAAGACACGAAGAGA TGTA scaffold131 TGTAGGCTGGAGTTTCTGAACATGCAAAAGCA A G 132GACACTTCTGGTCCTTCACTTAATATTCTCATCAAACT 27604:1398TTAGGGCCAAAAGGGTCAGATGCCCACAAAGC TATGGCAGTTGAGTCTTTGGTCTTTGCTCCATTCTTTGAGCTGTAATAGGTGACACAATTGGGGATCCCC CTGCTCATGGGGGCTTAATCTTCA TTAA scaffold133 TCCTCCGTTACTTGGATCAATTCGGTAAGACA G A 134AGGCAGGCATGGGGCAGCCTAGATGCCCTGATAGGGCG 125644:4761AAAGTCCACACTAATCTCTGTCCCTTTTTCGG TCTTAGGGCAGCTTTTGAGGAGCATGGGGGAAAACCCGGCACCCGAATCCCCCAGGTCCATGCCCCTGCC AGGCGAACCCTTTCGGGGCTCGGG CGCT scaffold135 GGGTTCAAGCTCCATCAAGACTTTGGCTGCTC C T 136CGTTTCCCTTTGCCTTAATTAGCAGCTTCTCGGCTTCA 
16027:10666GTCTTCCCAGCTTTTCGTTCTTATGTACCCTG TCTAGTAAGCCTGCACGGCCTAAGAGATCAACCATACACAACTTCTCAAAAACGAGCTCCACATCATCGT TGAATAATGTTGTCGATCAGGGCA GTCT scaffold137 GAGAAAATTCTATAGCTCACAGGCTTTAGGGG G A 138ACACAATGGGCTTGGGCTTTAACCCTTTCACCCAACTC 2502:17437GTCACCAAAATGCTCACAAGAGAGAAAGAGGA ACTACACCTCGATCTCTTGGAGTCCAGGCCCACTCTCTGCAGCCAAGAGGTACCACTCTCATAGAATGAT CGTTCACAAGCCCAATAGCAGAGA GATG scaffold139 CCACTCTCATAGAATGATGATGAACACAATGG C G 140TTCACAAGCCCAATAGCAGAGAAGTAGCGTCCTCTGCT 2502:17515GCTTGGGCTTTAACCCTTTCACCCAACTCACT GCTATGGTTGCAAGATTTAGTGATGCTGAAACTGGATTACACCTCGATCTCTTGGAGTCCAGGCCCACTC TGGGTCTGGGTCTGGGCCTGGGCC TCTC scaffold141 AGCTTGTCTCCCCCACGGCAGCCACTAGCCAG C T 142TCCTTCCATCCGCCATTTTCAAATTATTTTGATGCTGC 70502:1951CTGCCACTTGTCCGCCTCTCGTTCAATCCGAG CAAGTGTCCCAATGCATGAAAAAGTACTCAAAATGTCACTCCAGCAGCCTCCCATGTGCGCCAAGTGTCG ATTTCAAAAATACATTACAATTTT TCTC scaffold143 CTCACTTTTCGAGGAGATGCTTACCTATCGAA T C 144CGGGCTTTGATACTACTGGCTCTCTTGAAACTCGTACA 40620:28184ATGATCATCGATGCAAAAGTAATGGCTTCTAC AGAGAACTAGCTGCGTTCTTTGGCCAAACTTCTCAGGAACTTACGACGCCTTCATAACTGCTGCTAGAAT AACCACAGGAATACATACTGGTTC TTTT scaffold145 GAGGAGATGCTTACCTATCGAAATGATCATCG G A 146TACTACTGGCTCTCTTGAAACTCGTACAAGAGAACTAG 40620:28194ATGCAAAAGTAATGGCTTCTACACTTACGACG CTGCGTTCTTTGGCCAAACTTCTCAGGAAACCACAGGACCTTCATAACTGCTGCTAGAATTTTTCCGGGC ATACATACTGGTTCATTTATACAT TTTG scaffold147 TGCTTACCTATCGAAATGATCATCGATGCAAA A T 148GGCTCTCTTGAAACTCGTACAAGAGAACTAGCTGCGTT 40620:28201AGTAATGGCTTCTACACTTACGACGCCTTCAT CTTTGGCCAAACTTCTCAGGAAACCACAGGAATACATAAACTGCTGCTAGAATTTTTCCGGGCTTTGATA CTGGTTCATTTATACATGAAATGT CTAC scaffold149 CGAGCTCAAGCATTTCATTCTCTCGATTCTCG C G 150TTCATCCACCGACACCGGCGAGGCAAGCCTCCTCATTC 118158:666ACGACCCGATTGTCAGTCGAGTTTTTGGGGAA ATTAGGGGCCGCTGCCCACCCATGTTCCTCTGTAATCTGCCGGATTTAGGAGCTGCGATATCAAGCTAGC TACTGATTCGGATCCGGGTCTTCG GATT scaffold151 CAACGCTCTTCTTGGCCATGGAGACAACTCCC T A 152AGAGGGTCGGCTGCTGTGAAGCCGAGGCGAGGGAGTGT 41951:881CTGACCGGAGTCCCACCACCATTGGCGGAGCC GGTGGGGCGGAGACGGAGGGAAGAGGGTGAGAGTTGAAGAAAGAGCGGAGCGTGGAGGCGACGGCGTGGG 
GGAGAGAAGAACGAGTGCGGGAAG AGAA scaffold153 CCTTTGGGGAAGCGGCCGAACCATGGCTCGGT T C 154GGTGCCTGTTGAGGAGGGGTTGATGAGTTACCAGAGCC 95666:9974TTCCCTGGTCCATTTTCTTTTTCGGGTGGGGA CGATGGAGGGAGGAAAACTCCAGTGCCAGGAACAGGCAAACAGTGTCAGCAGCATGGTTTGTTGTTTCGT AGCGAGGTGGAGGATGCCTCGGTG TGCTscaffold12 155 CATTGACCGCCATGGATCCCCATCAGCTGCTT C T 156GGCGGTTTCGATGATTACGGAGCGTGGTATGGGAATAT 645:86648TTGGCAGTATCCCCTTTCGGCTCCGCGAACCA CCAGTACTTGATTAACATTTCGGCGATTGGGGCATTCTAATCCTCTCTCCTCCTCCGTCATCTGACGATG TCTGCGTCTTCATATTCGTCTTCT GAAG scaffold157 AAGAAGGAAGAGGTTGCCATCCGGGCATGGCG T C 158CTATCTCTCTTTAGTTCACTGTCGAGTGGAATTTAGTA 6627:26364AGAAGCGCATTAGCCGGGTTGATTGAAGCCGG GAAGAAAAAGTTGGCCGACGTACAAAGAGAAAAGAAGAATCAATACTCGTCGCTACTAGCAGCGTCACCA GAAGAGAGTGGAGTCTTATTACAT TCTCC32050599: 159 GAAGAGCTATCTTTAGAAAGTAATGACAAAAT T C 160AAAAGCATAAAAAATCAACACATTCCAATACATTTTGC 443CTCTTCAGCAAGTGAATTTAGGTTCACAGCAG AACAATTCAATCAAAACAAACACCAAAAGAAAATTTGTCAGAAAGCCCAAAAAGAACCTGATAATTAAAT TGTTGCAGCAAAAGACACAATGAA GATT scaffold161 GTCTTACATTTTATATTCTTTTTCAGATATGG A G 162ATCACCATTCATCCAGGAGTAACACTTGGCCCTCTCTT 20861:14886TATGCAGCTTCAAAAGCTAGAGCTGAGAAAGC GCAGCCCATTATGAATGACAGTGTCAGTCTCATTATGATGGTTGGGAATTTGCAAAAGAGAATGGGATTG ATCTAACAAATGGTACACATCATG AATT scaffold163 AATGGATCCTTTTGCGGCCTCTGCAAGTGAAA T A 164TGTCACTGGACTCAGCTGAAGCTCGATTACCCTGCTGA 30119:28969TAACTTTCCCAGCTTTCTTCTTTTTGTTGTTA GAGGAAGCTGCATTCTTTTTAGCTTCTGTCTGATTACTTTATTATTACTTTGGTTCCCGTTTTGCATAGC GGGGGCTGTAACAACTTTTGGTGT TTTC scaffold165 TGCTGCTTCTTAAGAGTTTTTTAACCCCAATC T C 166AACAGCTGATCATCAACATTGCAAATCAAGCAAAGGCT 2257:75397GCCCCGCCCCTGCTCGACAATTAGAAAACCAC GCAACGTCTAAAAGGCCATTGTGCCTAAGCAGTGCCTCACAAACAACCCTGCAACTCCATTTCACAGCTC AGGTGAAGGTTCGCGGCCTCGGAA CATG scaffold167 TTAATTTGAATTAGTTATGATTTTTTGAAGTT T C 168GAGCGCCTGTCTGAACCGCTCATCCGCGCGTGGAAATG 46867:905TCGGAAAAATCAGAATCGTGCAACAGGGCCAC CTGCAAACTCCTCCCTGCGCCGAGGTTTCTCGGCCAAGGCGCGAGGACAGCACGCATTCAGAAAAGCTGC CACTCCCCTTGCGCACGCGGAAAT TCGA scaffold169 GGCTCATGCCACTGTTGCTGCCACTTAAACTA A T 170CGACTAGTGAAATAACTGGTATCTAGTGCACTCTCCGA 
65132:21260TCATCATCGCTAGAATCTTCAAGTTCGCTTGG CGAGGGAACAAATGCAGCCTGCACAAATAATGAGTCGATGGATAACCATGGTCATCTGAAGTATTCCATG GTTAATGTATAGGAAGCGCAACCA AATA scaffold171 CTTCAAAACTAGTAATAATAATTATTGGGTGG C T 172AATGCCACAGACTTTGCTGATTTCTTCTCCAATCTGAA 94863:28441TGGCTATGGCTGCAGGCACTAAGAGCAAGCCC CGCCACTCTCTCCGATCTCGGGGCTCAGATTCGAGAGGACAAACCAAGCTGTTGAAAGAAGCATGCAGCC GAGATAAGTGGTTTGCAACGGCGC CGTA scaffold173 GCATGTATCCAGACCCAAAGCTGGCGAAAACA T C 174ATTATCCCTATTTCGTAAGGAACAAGGACGCCACATAT 94004:13590AGACCCTAAATCTGTAGTCTCTGGTTTCACAG ATGTTACAGGAGCGATCGGCTTAGAAACTGGTGCACCAATCCTAGACTGTTAACAGCATCAGCAGCAGCA ATATGGTGGAGCTGCAACATGGGG GCAG scaffold175 TCTGTAGTCTCTGGTTTCACAGATCCTAGACT T A 176TACAGGAGCGATCGGCTTAGAAACTGGTGCACCAATAT 94004:13632GTTAACAGCATCAGCAGCAGCAGCAGTATTAT GGTGGAGCTGCAACATGGGGATTAGTCAGACATAGCCACCCTATTTCGTAAGGAACAAGGACGCCACATA ATTCCTTAATCCACCTCAATGAAA TATG scaffold177 TTTTTGATGTTAAGAATTGTACGTTGAGACAT T C 178TGCTTCTTGCTGCCAGTTTCTCAAAGAGGCTCATAATG 39420:9067GAGAACTAGCAGCGAGTGTTTCTCACCTGCAT TTATGATAAGCTTTTGCAAACATAGAGAAAAATAGGATCATAGTTTTTGAGAAAGTCAGGAGACAAATTA GAAACAGAGAGTTGTATTAATAAT TGGT scaffold179 TGAGTTTTGCAGGCGCAGCTCGGCCTGATGCA C T 180GACTGCAATTCAGGTGGCTGCAGCCAACCCGAGTCTGT 42291:6484CCAGAGAACATAAGGTCCATTCTCATCAATCA TATCGAGCTTTTCATGGAGTCTGAATTCTGTTTGCAGCGTGCACCTCAGTAGATGATCAAGACTGTCAAT CACCAGGAGATAGCCCAACTAGAA TTCT scaffold181 GAACTTGATCAATCCCATAACTCAGTCCTCCA G T 182GACATGTTCTTCCGCTGAATGAGGCACCCATAACTCTT 23828:34435TTGTCATGGAGCCAGCGGCTCTTCTCCCTACC GCAAGTATAAGGATCCCAAATGATGCTGCTGTCCGGCGCTGCAAATCAAAAGAGTCTTGGCTGCCATAAA GGATTGACCAGAGGCTGTCTGGAA ACCC scaffold183 TGCCTAATTTGTTGTTTATTTGTCATTGGTGG C T 184GCTCAGGCTGAGAGACACATCCCCGCATCAACACCCAT 15017:4539CAGCATGATGCTGCATGCCGTGTGATTGCAAG TACAGTAAATGCGTCTGCAGTTAGCAATGGAAGAACAGACTAAAGAAAGAAAGAGATGAAGCAAGATCAC GTCCTTTTTGTTTTATATATGCTG TACT scaffold185 GGCCGTTCTTGGGGACGACCTCTCCGTCCGAT T C 186CCAATACAAAATTGGAGAGCCTTGGTGATGAAGATGAA 16607:2589TCGCCAAACTCAGAGCCTCCTTGTCTCAGTCG GAGGAGGATGAAGATGATGAGGTTGAGAAGTTGATTAGCCTGCTGCTTCCATTTCCGGCACTGGTGGCGT 
GTGGGCCAAGGACGCTGCTCGCCT TCGT scaffold187 TACATACATACATACACAAATAAATAATTATA G A 188GCGTCTCTGGCACCTGCAACCATTGCAGCTTCGGTCCA 27758:4907AATAGTAATACCTCCACTAGAGCAGCCTCTTT CCCCAACTTAAGCTGCACCAACTCATCCCAAACGCTACCCTGGGAAGTGACCCAATTATGGCGGGAGACA AATTAAAAATTCACCCCCAATTTT CACC scaffold189 TTTTAGGTGTTTTGAAACAGTCTCAGAATCAG G C 190AGTAATGGAAAGTTGGTTTCTATCGATGTCTTCGAGTT 20809:7695CATACGAATTTTGTCTTGCTGAACACTGATGC TGTTTCTGCTGCTAAGAATTACTGGTCTTCAGAAGTTCTGCTTGGGCAAGAAATGTAGAGAAAAGATCCG TCTCTGTTGGTTTTATGGTTCTAC TTTG scaffold191 TATTCCCTTCTTGGGTGCAAATGTGACATTAG T C 192TACCACCTGTGTGCTGCAAATTCTTAAACCTGTCATTC 36583:13571AATCTTTACCAATTGTGTTAGCATGCTGCCCT CTCTTTCTATTGATGACTGTAGTCATTAAAAAAAATTATTCTCTATGGGTTACGAATCGCTTCTGAGATA TTAACTTAAACTAAAGTTAGGTTG TCTT scaffold193 TCACTCTCTTCCTAATGGGATTGACCATTAGT T C 194TATGCCTTTGATCGAGCTTTCGAGGGCTCGGTCTCTTC 36500:1728TGGAATGGTTCTGCTGCAAATGGTCCCATGTT CTTCGCAGCTCCCTTGGTTGGAATTCTGTCGGAGAAAATGCTGAGGTTGTCCCAGCCAAGCACCGGACCA TGTTCGGGTATGATTCGAAAGGGG TGAT scaffold195 TAATGGGATTGACCATTAGTTGGAATGGTTCT C T 196CGAGCTTTCGAGGGCTCGGTCTCTTCCTTCGCAGCTCC 36500:1740GCTGCAAATGGTCCCATGTTTGCTGAGGTTGT CTTGGTTGGAATTCTGTCGGAGAAAATGTTCGGGTATGCCCAGCCAAGCACCGGACCATGATCTATGCCT ATTCGAAAGGGGTGGATCCATTGT TTGA scaffold197 TGACCATTAGTTGGAATGGTTCTGCTGCAAAT T C 198GAGGGCTCGGTCTCTTCCTTCGCAGCTCCCTTGGTTGG 36500:1749GGTCCCATGTTTGCTGAGGTTGTCCCAGCCAA AATTCTGTCGGAGAAAATGTTCGGGTATGATTCGAAAGGCACCGGACCATGATCTATGCCTTTGATCGAG GGGTGGATCCATTGTTGGGGTCTA CTTT scaffold199 CAGAGAAAGAAGAGAAAGGTACTAAAATTGGC A T 200GGGACACGTGTCAAGCATCACTCGGTTGAGAGTGGGGC 153198:2269ACAGACTAACCGGGCCTCCATGGTGTTGAGTG CCATCACAGGGTGGACTCGAGGTGGGCCTTTTGCCTGTGGCCCCCACCCACGAACTGTAACGGGGCCCAT TTCATGCTGCGTCAGGTAAAACCA TCAT
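The rows of these sequence tables can be read as a flanking-sequence record for each SNP. The following is a minimal sketch, assuming each row supplies an upstream flank, a minor allele, a major allele and a downstream flank (the column order given in the Table 7 header). The class name `SnpRecord`, its field names, and the short example sequences are hypothetical placeholders for illustration and are not taken from the Sequence Listing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnpRecord:
    """One table row: flanking sequences plus the two alleles at the SNP site."""
    name: str        # SNP name, e.g. "scaffold<id>:<position>" (format assumed)
    upstream: str    # sequence immediately 5' of the SNP
    minor: str       # minor allele base
    major: str       # major allele base
    downstream: str  # sequence immediately 3' of the SNP

    def snp_index(self) -> int:
        """0-based position of the SNP within the assembled context sequence."""
        return len(self.upstream)

    def allele_sequences(self) -> dict:
        """Assemble the full context sequence for each allele."""
        return {
            "minor": self.upstream + self.minor + self.downstream,
            "major": self.upstream + self.major + self.downstream,
        }


# Placeholder values only; real rows carry much longer flanks.
example = SnpRecord(
    name="example_scaffold:1",
    upstream="ACGTAC",
    minor="C",
    major="A",
    downstream="GGTCA",
)
seqs = example.allele_sequences()
```

The two assembled strings differ only at `example.snp_index()`, which is the property an allele-specific primer or probe design would rely on.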

TABLE 7 Upstream, Allele and Downstream sequences for SNPs from Table 5
SNP Name, SEQ ID, Upstream Sequence, Minor, Major, SEQ ID, Downstream Sequence
scaffold 201 TCCAGTAGCCAAGGTCATCGGCGGTGCCCTAAACAAG C A 202CGAATGCATAAGGCTGCTGTAAAGAATGTTTACGAG 13038:51303CTCTCCAAGATTAAGGTTGTGAGGCTTTTAAGTTTGGAACAAGAGGTTCATTCCACTCGATCTCCATAGGAAG ATTGCTCGTGTGTTGACGGTGATATCAAGATCAGGGAGATTTGCAGAAGGCTTA scaffold 203TCTTATTCAACCAAATGCCATAACCGGCCATAGCTGC A G 204TGAACACTCGGCCAAAGCTTGCTGCCTAGCCCCTTC 25092:11841ATAAAAAATATCAGCTATAAACCTCCTGGGCTTTCTGCCAATGAGATGCATAACTGGCCAAGAGTAAGGACAA CTATTCATCCAATGTATCCAGTCCTCGAGAAAAATAAATCAGAATGGCTCCAAA scaffold 205ATGACCGCTGTTCCAATGCTGCCAACTTGGTGACTGA C T 206AAGGAAGACAAAGAGAGCGAGCTACAATCTCTGGCA 23837:26190AACAAAGTTTACTTCATGTTACTAGTGTTCCTCTGCCGCAGCAGCCTTGGAGAAGGACTACAAACGAAGAGCA ATTGGGCAGCTCGCAAAAGGTCGGAACCTGCAACACTGTCCTTGATTACGACCG scaffold 207TAGTAGTGGTAGTGCCACTCTCGGTGGTTTAGCCAAA C A 208AAACCGCTACCGATAAGAGTGTCATAGCGGCCTTGA 152474:1505ATCGCCCTCCAAGCAGCGGCTTCAAGTGCCAAAAACAAGGATTGCAGTGATAATTATGATAGCGCCAATGAGG CCCAAGCCCAGATCACTAGCCTACTCAACTCGGTGACTCGCTCAAAGCTATTGA scaffold 209TGTACAAAGATCTTTGTGAGAAGACACTGCGAGCAGA G A 210AGTGCCAAAAACACCCAAGCCCAGATCACTAGCCTA 152474:1465CCCTAGTAGTGGTAGTGCCACTCTCGGTGGTTTAGCCCTCAAAACCGCTACCGATAAGAGTGTCATAGCGGCC AAAATCGCCCTCCAAGCAGCGGCTTCTTGAAGGATTGCAGTGATAATTATGATA scaffold 211TCCAATTTCAGTGAGAGAGAGAGCTGCTGTATGGAAG C T 212TGAACAAGCTTAGGGATCTCAAGGCTGAGCTCCTTT 13038:51162CTAAAAAAAAGCAGCAATGACAAGAATCAAGGTTGACTCCTTCCAGTAGCCAAGGTCATCGGCGGTGCCCTAA CAGCTGAGGCAGAAGAACAAGACCTTACAAGCTCTCCAAGATTAAGGTTGTGAG scaffold 213ACTGCTGGAAATGGTAGTGTTCTTTTCACTATTTCTT A T 214TCCCTAAATCTAACAAACCAACTTTTGCACAGTTTT 5841:136325AAGTTGCCTCCCAGAGAGGTAGTTTCATCAATGTGCACTGCTGACAAAGTAGCATCTTTTTGGACACTATTAT AGCTGCTGTCTTCATCCATGGTGGTCCATGCAATGCAGCGACCTCAATACTACA scaffold 215CTGTAATAGCTGCTGCTGCTGCTGAAGCTGTTACTCT A T 216CAAAGTACTTCTGTAGTTACTTTAGAAGCTAGTTAT 5876:22669TGCAAAAGCAGCTGTGAGGGTTGCAAAGGATGCTGCTTCGGATTTTAAGGGTGTTCATAATATGAAAACTACA 
CTGCTAGTGAAAACTTCTGAAAAACACAAGCTGCTGTATTAGGAGATTCTGTGG scaffold 217TTCTTGGTTGGTGTATATTGCAGCACTTTCACCCTTG C A 218AAGCAGCCCAATCTTGTACTCACTTTGATTTTGTTA 764:75880AAGATTAGGCTAGGGGGCACTTTACAAGACAAAGTCAAAAATGGCTCTGAATTATTTAGCTTCTCTCATGGTT TATACCAGAACGAGAGTGAAAAAACTGCTTATCTATGTCTCGGTGGGATCAACT scaffold 219CATATACATGCTGACTGGGCAGCATGTGTAGGACACA C A 220AACAACACGCAATACCGAGGTCCTCAGCTGAAGCTG 32076:9119AGAAGGTCCAAAACAGGTTTTTGTGTTTTCCTTGGGAATTGTAGGGCTATGTGAAACACTAGTGAATTGGTAT GATCTTTAGTGTCCTGGAAAAGTAAGGGCTGCTATCATTACTAAAGGAACTGAA scaffold 221GAGGTAGACTTTGTATTTTGAAGCTTTTGGATATTAG T A 222AGCAAGTCATTGAGGGTTAGCTTAGAGAAATCAAAG 7992:917AGCTATCAGGGTTTTAATGGCTTCCCAGGTGAATAACTCCCTAGGGTTAGCTCTACAGAAGACAGGACCTAGA GATGGCGTTGAGAAATTGTTAGCTGCTTGGAAGAGATTAACCAACGATTACCTG scaffold 223ACACTTTCATTAAATATATGTCCGGTATGTACAGAAC C T 224TGCATCTTTAGACAAATTACACATTGAAGTTGATAC 118405:2916GCAACATTCATGGCATAATTAACACGTCAGCCAAATATTCTAGCTGAGCTAGGCTAGGCCGACTCGCAGCTAA ACATTCAATCATAATCAATTGCAGCATATTTAATCAAATAAAAAGTACCTAAAA scaffold 225AGGTGAGATATGAGAAGGGTCGAATGGTAAAATATAC G T 226GATTGAAGGAGGAGCGGACAAGTGTCGCTCATGTGA 2418:77480AATGTGCAATTTATTGGCTAAGAGAAATAGTAAAAAGTGGCTGGAGGAGAGGAGCTGCTGGAGAGAGAGGGGA AGCACATGTATTTGCTGCTGAAACTCCAGGCGGGTCCCACGCAGCTGGCGCTGG scaffold 227GTTGATACTTGACTAATTTCATATGGTGCTGTGGAAA T C 228TGGTTTTATAAACAGTTTGCCGATGGCTAGTACGCT 34829:1873TGTAACAGGTGGATGTTGTGGATACAGTTGGTTGCGGGTCAATTGCAAACGCAGTTGGTGCTGCAACTGCTAT GGATAGTTTTGTAGCTGCTATTGCATGGGTTGTGGTGCTGGTAGGAATGTGGCA scaffold 229TATAGGTGTGTTCGGCTATGGAGATAAGTGGATTCTT G T 230AGATGGCACGTGCTTGTCACATGCGCATCAGCAGCT 38125:4641CTTGTGCTTAGTTGGGGCAGCGAGAATAACGCACAGAGGTGGGCCGGACCCATCCAAAGCCCATTTTCTAGAA GCTCAAAGGATAGTGTCGATAGCAACGCCAATTGCAAGGATGATCAGCAGAGGT scaffold 231TTCTTACTCCGATGTGCTGTGCAGATCACTGGTCAGT G A 232TTTTTGCTGCTTGTCTTGATATTCTTTCATTGTGTA 37469:74270CAGCATTATCATTCAAGGGAAGCCCCTACTGGATGGCTGATCTTGATTCTATTTCATCCTTCTTCCTGCCTTT ACCTGAGGTAAAAATCTTATGAAGTTCTCCCAGGTAATAAAGAATTCAACTGGC C32100775: 233ATTTGAGATCCTTAGCCGATAGAATTTGTTGGTTAAG T C 
234GCCATGGACGCTCCAATTCTTTGTTCCTCTGCTTCC 1618TCTCTGAATCTCCTCATCAGCTTGTTTTAGTTCCTTAAAATGTGCAGCATGTGCAGACTCCAACGACTCCTTC TCCCAATTAAGAGAATCTTGCTCCCTGTTGCAATCAGCTCGATAGTGAGTTCCT scaffold 235TTGGTTCTTCTATTGGGTTTACTATACTTTGGAACAT C G 236ATGACACTGCTTCCAAAGCAGAGAGTTCCAAAGGTT 5190:41424GAATAGGCGAGGGTGGATTGAGTATTCTGTGGCTGCCTGGGGAGTTCCCATGGTTCCATGGCTGCCATCTCTG TTGATTTGGTCCGTGGGGAATTTGGGTCTATTGTGGTTAATCTTTTTCTTATTG scaffold 237GGAACATGAATAGGCGAGGGTGGATTGAGTATTCTGT G A 238AAGGTTTGGGGAGTTCCCATGGTTCCATGGCTGCCA 5190:41454GGCTGCCTTGATTTGGTCCGTGGGGAATTTGGGGATGTCTCTGTCTATTGTGGTTAATCTTTTTCTTATTGGG ACACTGCTTCCAAAGCAGAGAGTTCCTCTTTGGGAATGGTGGCCTTCTTCAGGT scaffold 239GTATTGGACCTCGATCTCTCAACTTTCGACCGGCTTC A G 240GCGCGTGAGAGGAAGCGTTGTTTGCACGGCGAGCGT 49917:1105TGGAGGTGAACGCCCGAGGAGTTGCGGCGTGCGTGAAGGCTGCCTCTGTCGGTTTGAGGACACGGACAGATTA GCATGCGGCGCGTGCGATGGTGGAGCCTGCATGTCGAAACACGCGGTGTTGGGG scaffold 241GTCACCAAATAAATATTGTTTTTTAATATTTCAGAGG T A 242CTGCATATCATTTTACATTTTCATGTTTCTTTCTGC 4775:75747AGGTTTTGCAGGGAGTTGCTGCTGGCATCGATCTTTTTGCTAACTTGGCCAAACTCCGATGATGTAAATGACT CGACTCTGAGTATGTTTGTCCTTTCTGTTTTGCAGGTATATTTACCATCTTACA scaffold 243AAACTTTGGCTGCACCTCCTGCTGAAGAGAGTGTGCA G A 244TGCATTGTTGGCCGATCAGTTGGTTGTTTCTTGCTG 26152:15432TTGCGCTGTTGCTGTAGAAGCCAGTTTAGTTGGCAAGCAAAGCAATGCAAGTTGAAATACCTTCCTAACCGCC AGACTGCTGAGAACTCGTGTTACTTCCCGAGATCCTTACATGTAGCTGTGATTT scaffold 245CACAAACGTTTTTATCAGCTCTCCATCTTCGCTACAA G A 246GGGAGTGCTGAGAATCAATGTTTCTTAGAGCATCAG 5841:135056GTTTCATTAGATATTTGACAGCCAGACATGGCATCTAGCACCAACTCCATAGCTTCATCACTGGAGGTAGCAG ATATGCATGCATTGGATGCAGTAATCCGGACACTCCATTTTGATCAACAGGCAA scaffold 247CCTTAGCATCTCCTGATTTTTATGATGTGAGTCTTGT A G 248GGATGTGATGGCGACCTCCGGACCAGTTGCCCTAAC 5876:106401TGACGGTTTCAACTTGCCCATAGTCGTCACACCACTCGAGCTGGCGGTCAAGAGCAATGGGAAAACGATTGCC CACGGGCAAGGAAATTGCAGCGTGGCTGCCGGAGCGCGTGTGACGTGTTCGATA scaffold 249CAAGAACTTGCCTTGCTGCTGCATCTGCTGAACTGCA A G 250ATCAAAACTTCCATGCCACGTGCAGCAAACAGTCTC 5113:138895GACAACATTCATTGCTCCAATCCTTTCTGAATAAAGAACCGATGATGCATCAGCATCAAGGCTTCCACTAGCA 
CCCAAATTTTTACTGTAAGACTGAGCAATCCCTTTAAAGAAAAATAAAATAATA scaffold 251TATACAACAGAGACTTCACTGCCAAACTTAATCTCTC G A 252TCATCATCATCATACAACCAGAGAAAAAAAAAAACA 5113:239777TGTTTAAGCAGCACCACCAGTCCACCACCAAACACATGTGATGCATTTTACAGAGAAAGAAAAAAGCTGCAAT TTGAGCAAACACTACTAGTTCTCATCGATTTCTAAGCAGCCTTTTTTTAAACAA scaffold 253TTATCAGCTCTCCATCTTCGCTACAAGTTTCATTAGA G A 254GAATCAATGTTTCTTAGAGCATCAGGCACCAACTCC 5841:135067TATTTGACAGCCAGACATGGCATCTAATATGCATGCAATAGCTTCATCACTGGAGGTAGCAGCGGACACTCCA TTGGATGCAGTAATCAGGGAGTGCTGTTTTGATCAACAGGCAAAGAGAGATCAT scaffold 255ATACTTCATATTCCATTTCTCTATCTCCACTCTGCCT A G 256GTTAAATCTTCCACCTTCCGCTGCAAAAGAGGAACA 25099:7321CTGGCATGAAGTAGTCCTTCAGGTGGTTTTTGCAGCATTTATAAATCAATGTACAATATCAGAACTAGGGGTG GAAAGATCTACGGCCTGAAAACTCGCTGCATAAATCGATTAAATCCATTAACAA scaffold 257GCTAATGGCTCTTATTGGTCAGTCGTTGAAGTGGCAG A C 258TCTTTAAAGTAACAATTTGTTCTCCCATTCCAGGGT 14172:40875CAACACCAAGGTATATATACAATCTTATTGTGCATTGTACTTCCTCCAGGCACACAATTTGACTTATTTAGAG ACAAGAGAACCATATTATCTTGCAATGAACAGCTGCTATGAAACAAGATGTAGA scaffold 259AAAGGGCATTCAATCCTTACGGTTCAGCTGCTTACAA A G 260AAACCCCTTCACGTTTACGGAAGCCCAACTTACGAG 158332:502CTTCATCACCATGGGAACTTACAGAGGAGGACTTAACTGTCAATGGCTGCAGAATTCTTCTCCCTCCTCCGTT ACCTTCGCCATTGTTGCAATAGCTTCATCTCAACTGTTGGGTACAAGATCCTTC scaffold 261CTACAAAGGTTGCGTTTGGACCGAGCTCCTTAGCCGT T G 262GTGGCTGTTGCTTTGCCTATTCCGCTTGCTGCTCCT 16869:149709GGCTAGGCCGAGATTGTGTTGGATGTCTGCTATGACTGTTATTAGTGCCACCTTCCCATCAAGCCTGTTATTT ACTTTGGCACCATTGTTGATGAATTTTCAATGAGTGAAGAAATTGTCTGAGTAG scaffold 263AATTGGCAGAGGAGAGTTCATATGCCAAAGGTTTAGC C T 264GAGAGACTAGCAGCTGATCTAGCCGCATCCAAGAAC 114539:1119TTCAGCTGCTGCTGTTGAACTGAAGGCATTATCTGAATCCCCCACTCAGCGCAAAAGTGGCAGTATGGTCAAG GAAGTTGCCAAACTGATGAACCATAAAATGGGCGAAGAGAGAGCATGAACAAGC scaffold 265TTCACCCAAGAAAGAAAAATGTGATTGTCACATATGT T G 266TTATTGTTATGAAATCTATGGTTGTTCAGTCTAGTG 3842:272682TTGCTGCATGTGCATGGACCATTGGGTTTAATCTTGCCATTTCACCACAAAAAAAATGTGATTGTCGTATATG ATCATGCTGTTTCTATATTATGATCTCGTGCTGCATGCATGGGCCATTGGGTTT scaffold 267CACCAGCCCGCCTCATGTCATGAAAGGTTTGAATAGC G C 
268ATGCTCAATGGAGTTGCAAAGAATATATCCCTTGCT 60331:8510AGCAAAGCCATCATTGTTATGAGAGCAACAAGTGATCAACTTAAGACTCCCAGCAGCAGAGTAGGCAGCAATC ATGGCATTGTAGAAAACAGTATCTCTATTGTCGTTCTCGAAACAATATCTGGTT scaffold 269TTCACGGAATTTTAAATATTTTTACGATTTTTTTGAT G A 270GGAGAGAGTTAGAGCAAAGACGATGGTATGGAAGAA 2360:22109TTAAAATGAAATACCTTCAAATCCTCCGACTGGTGTAGTGGCTAAGTGCTCTGCTGCAGTAGTACAAGTAGTA AATAGCTGCAATGGTTGGACAGAGAGTATATATGGAAGGGTATAATGGGAAATC scaffold 271GCAATTTAAACAACCTCTCCTTCACTGAACTCACAGA G A 272TCGGCATAGTACACTGGCGGTACCAAGGAGACAGGT 72613:1728AGACGATGACAAAGATCTCTCCATCTCTGCCTCATGATTGGTGCAGCGAGCCATGGTAAAACACATGCTATAT TACAGTCTCCCTCTATAGGCAGCAAGATGAGCTTCTGCAATTGGTCAGAAGTAA C32052717: 273TGGTCATGACGTTGTACTCAACTGAATGTGCGCTGCA A T 274CGGTCAAACATGATGACACGGTCGTTGTTGAGAAGC 309GTCTTTGGCCCCAGCAGGGCATTTCCCAGGCGCCAATTGCATGTGCATGGCCGAAATGCCAATGCTTGGTGTC GATATGTTAGAGGTGCCGAAGTCAGTAAGAATTTCCACCGGCCACCGGCTGCAT scaffold 275TTTCGCCCAAAAATTTGATGCTACCGAAACCGTGTCG C T 276TCTTGGATTCGTGGCTCGGTGTAGTAATCAGAGCGC 71943:13435TCCAACAACAAAGTCCTTGACACGGCAGCAGAATCCTCTGAGCTTTGGCATTTGAGCCTCAATATCTGCACCG GGTTCGGCCCTTTCCTTGGCAGCAAGTGCTCGTACACAATAGCAGCCTCACCAG scaffold 277GTCCAACAACAAAGTCCTTGACACGGCAGCAGAATCC T C 278CTGAGCTTTGGCATTTGAGCCTCAATATCTGCACCG 71943:13471TGGTTCGGCCCTTTCCTTGGCAGCAAGTTCTTGGATTTGCTCGTACACAATAGCAGCCTCACCAGCTCTGTGC CGTGGCTCGGTGTAGTAATCAGAGCGCCGCTAAGGGTCTTATAAGAATCCCCCT scaffold 279CCTGTTGTTCCCGAACCTTGGAAGCCACTACATACAC G T 280GTAAGGGATGTTCATGGGCATGTGACTACTGCTTAC 128544:170TGCCGCCTCTTCTTTCAGGTCTTGAAATGAATGTTGATCGAAGGCACATTCAAATCACATGAGATGAAGTTAT TGCAGCAATCAGTATTTCTACTCGATGACACTTTTTCATAGCCTTCTTTGGGCA scaffold 281TAAGAGTGAGAATTTTATTATGAAGATTGAGGCGATT T C 282TGAGTTTGCTGTTGCTGCTGTTGTTGTTGGGAGCGT 98263:1069ACCAGTGAGTACAGAAACGTGTTCTGGGAGTTTAGGGCGAAGAGAGGAGAGAGGTGGTTCGATACGGAGGCGG GCATTGGGGTTTTGGGTTTTTTGTGGCGCTTCCTGCGACGGCGCTCAGCTGCAG C32058675: 283TTTTTGTATGATTAATTTATTATAATATACAGATAAT A G 284AATGGGCCTAGGACTGGGCCTAGAATATCAGCAGCT 292AACGAATGGGAGATTCAAGAGCGTGGAGCATAGAGTGTGCTTTTTTTATCCAAGCATTCCCAATATTGCGAAA 
TTGGCAGCTGGGGGAGGTGGGCCTGACCTTACGGCCCAATTAAAGAGCTTCTCT scaffold 285ACCTGAAGATGCCTAATTAAAGGCATTCCATTCTTCT C T 286GAGCTCTTCTATTTCATCTGATGTCAACAGGTCCCG 823:11824TCCTTTTCAGCTGCCAATGCTCGTAAATGACTTTAATCTGTTGAGAATATGCAGCCTTCTCAAACATGTCCAT TACATCCATGGGCCCAACTCCAGCAGAATTTTCTCAAACATCTCTTCAGATATC scaffold 287GGTTGGGGCTACAATAGCTATGGTCAGGCAGCCAATG G T 288TAGCTTTATCCTTAACACTGCTTTTTCGTTTTCTAT 75287:5899AGAAATCTACCTATGCTTGGTTTCCATCACCAGTTGAGTTCATGGAGCTCCCTTTGCAGGTGCGTTGGCGAGG TTGGTATGGCAAAATTATATCGTTCATGCGAAAACTGGCAGCTGGTGGTGGCCA scaffold 289CGAGCTCCTTAGCCGTGGCTAGGCCGAGATTGTGTTG C T 290CCGCTTGCTGCTCCTGTTATTAGTGCCACCTTCCCA 16869:149730GATGTCTGCTATGACTACTTTGGCACCATTGTTGATGTCAAGCCTGTTATTTTCAATGAGTGAAGAAATTGTC AATTTGGTGGCTGTTGCTTTGCCTATTGAGTAGAAAGAAGGTAAAGAAAGCAAT scaffold 291TTATTCATGTGGAAGTAGCTCTATAAAACTGTCTCTA C T 292ACATATCTGTTGTACCTCTTTCCTGGAACAGCTTCA 3842:543149ACATGGTATTTAAAGTGTTTGCATTGGAAAAATGTTCTGTAGCTGAACTAAAACCCACAGAGTATAAAGCTGC TTACCAAATGCAGCTTGAGCCAGTTCCAACAGTAGGCTATGGTCAAAGAAAGTA C32058613: 293TTGACATTACTAACAATTACTGAAGCCCCCATTACAC T C 294AGAAGCAAGCACCTATTTTTTGCTCTTCGAACTCTT 508TAATTTGGGGCCAATAAGCTTGACTCCTCGCTTTCCACCTCTTCCTTATCCATTTTATAAGCTGCTAAATCAT AACGTGTTATATTGTTGCTGCCATAAGAACACTTAAGTCTTCTAGTTCAGCCAA C32058613: 295CTTGACTCCTCGCTTTCCAAACGTGTTATATTGTTGC C T 296ATAAGCTGCTAAATCATGAACACTTAAGTCTTCTAG 563TGCCATAACAGAAGCAAGCACCTATTTTTTGCTCTTCTTCAGCCAAATTTTGAGCACTTCTATACGTATTCTG GAACTCTTCCTCTTCCTTATCCATTTTTGCTAGGACAGAAAAAGAAAAGGTGTT scaffold 297CGGTGGTGGCTGTGGTTGGGGTATGCACGTTGTCGTT C T 298TGGGCTTATTTCGACCGTGACTTCTCTCGCTGGATC 7146:70340TAAGAAAATAGTGGGAACATACTTGGTGGGAGTAGTGCACCCAGTTACCGCTGATGAAAGGGCCTCTCATGCT GGCCTTGCCGGGGTGTTTTGCCCAGAGCTTCTCACAGATCTGGACTACCAAGGC scaffold 299TGCAATCCTGTGGTGCACTAGAAACCGCCGCCGCATC G A 300CCGTGTGGCTTCTCGTTCTGTTCTCGCAAGATTTGC 23125:41276AGCATCTGTCTCCTTCCCACTGAAAAGCTCCAGTTGCAGCCTTACAGTTTGGGAGCTCAGATTAGTAGATGGG AAACTAGACAAAGCCTCTTGCGTAAGGAACGGTCAGTTGATAAACTTGTAGAGT scaffold 301AAAATATACAATGTGCAATTTATTGGCTAAGAGAAAT A C 
302TCATGTGATGGCTGGAGGAGAGGAGCTGCTGGAGAG 2418:77508AGTAAAAAGAGCACATGTATTTGCTGCTGAAACTCTGAGAGGGGACAGGCGGGTCCCACGCAGCTGGCGCTGG ATTGAAGGAGGAGCGGACAAGTGTCGCAGGCTGGGCAGGAGGAGACAGGCGCTT scaffold 303CCAGCTCTGCCGGATTCCAATCCGTACCGACCTCGCC T G 304CCGGCGAGTCATCTTCAACGGCAGCAGATTCTGATT 12000:86305TTGGCGGAGCCTCTCTCCGTAACCGGAACCCTCCGCTTCGATGCGAAAGTGTTCCGTAAGAACTTGGTCCGAA GCCGGAAACCTTTCGTCATCCGATGCGCAAGAACTACAATCGGAAAGGTTTTGG scaffold 305GTTGCTTTATTTGAATTTTTGAGCTTCATGATCTGGC T G 306GGGGTCTGTTCTGCACACAATAAAAGAACATTTTTA 24181:60784AATGGAGTTAATACTACACAGCTCACTAATAAACGACGGTATATACCAAGATAGGCTGCTCTGGACATCATTT AAAAGTTTGTTGTCGTTTTTTGTTTCAGCAAGGGTCACCATCTTCTAGCAAAAC scaffold 307ACAGTACCAGTACATCAGGAAAGAGCAAGAGCAAGAG G C 308ACAATCCTGCTCCCCAACACTCTCCAGACCACCCAA 26621:72993CAAGAGCAACGACGCCTACTACCAATAACCCAAATAGCACATCCAATACTACTACTACTACTACTACTACTAC GGCAGCAAGGATGATGATGATGACCATGCTGCCTTCTTCAGGGGTAGCACCATG scaffold 309ACGCCACTGTGTTGACGTAATTTCTACATGTAGCAGC T C 310GCCGACGCTACTGTGTTGACGTGACTTCTACATGTA 6391:16360GTAATGACAGTATTCTATTGCTGGAACGCTATTTGCTGCATTGTAATGCCAGTATTCTATTGCTGCAATGTTA GAAATGTTATTTCTTCATTTTTGTGTTTTCTTCATTTTTATGTCGCCAATGCTA scaffold 311GAAACCTCTGTGCCTTGGAATTCTTGCTCTTAATTAG A T 312ATTGCACAAGGTGTGTAAGAACCAAAAGCACTCTCT 88759:12655CCATATTCGATACACACCTAGGCCCAAAAGAACCAAAACTATCTTTCCCCAAACCCCATAAGCTGCAGGCTTA TGAGAAGTAGAAATGACTAAAGACTCCAATACCAACTTAGTAGCTCAAACCTCA scaffold 313TGGATGGCACCTGAGGTAAAAATCTTATGAAGTTATT C T 314GCCTTTCTCCCAGGTAATAAAGAATTCAACTGGCTG 37469:74336TTTGCTGCTTGTCTTGATATTCTTTCATTGTGTATGACAACCTTGCTGTGGATATTTGGAGCCTTGGATGCAC TCTTGATTCTATTTCATCCTTCTTCCTGTTTTGGAAATGGCTACTACAAAACCA scaffold 315TCTGCCGGATTCCAATCCGTACCGACCTCGCCTTGGC T C 316GAGTCATCTTCAACGGCAGCAGATTCTGATTTCGAT 12000:86310GGAGCCTCTCTCCGTAACCGGAACCCTCCGCTGCCGGGCGAAAGTGTTCCGTAAGAACTTGGTCCGAAGCAAG AAACCTTTCGTCATCCGATGCGCCGGAACTACAATCGGAAAGGTTTTGGCCATA scaffold 317GCTTGACACAATCATAAATCCAATCAGAGGTAATTGT G A 318GTGACCTTATTTGTCAATCTCTCTACTAATTTGGTT 6143:103796ATGTATCCCCCATTTACAAGCAGCCTCATACTTTGGTCCAAGAACAAAGCATAAATTTCTCAAAAGCAGCCGA 
CCACTTGCAAACTTGCAAATGAGATGTCTTTCTCTTCATATTGTGAAACACAAA scaffold 319GGACTCAATCACCAATGCTGACCCCAAAGCTGCATTC A G 320ATGCCATATTTGTTTTCGCTACAAAATGTGGCATTG 33135:78155ATTAACTCAAAGTAAGAAATCTAGTCCTCTTCCAATGCGAGCTAGCTGCTGGGAAATTAGAATTGGATCACAA CTTAATTGGCCAACCCGAGCTCTCGCACACAGCAAATTTATTTGAAATCCCTAC C32058675: 321TATACAGATAATAACGAATGGGAGATTCAAGAGCGTG G A 322TATCAGCAGCTTGCTTTTTTTATCCAAGCATTCCCA 317GAGCATAGAGTGTTGGCAGCTGGGGGAGGTGGGCCTGATATTGCGAAACCTTACGGCCCAATTAAAGAGCTTC AGAATGGGCCTAGGACTGGGCCTAGATCTCTCACACCAACCCACCACTCTACAA scaffold 323TTTCCAACCATCCTTGGATGGCTTCCCTTTCATAAGT G A 324TGTGAGCCAAGTTAAAGAACAAGTTTGAACCTTACA 1976:5193GAATCCATCAGCAGCCACCTGAGGGTCATGCATTATTCCCAAATTTCGGGAAAGCTTGAGCTTGTTTAGAGTT TCCTGAAAGTAAACCAGAACATTAGGGGGGTTTGATTGAAAGACATGAGAAGCA C32098343: 325AAAGCTTCAGACAGTTTAGAGGCAAGCTGTTCCAGAG T G 326CCGCCATACATCAACATTATTCTCACAAACCATGCA 2061GCCTCAAGCCTACAGCTATCACTCTTTTGTACGTTTCGAGATCAAGGTATGCAAGATACTGAAGAAGAGACCT ACCACTTTCCTCGAAAAAGGCAGCCCCGGGCTGCTCTCTTCTAGTGCTGCAAGG scaffold 327AAATTCTAAACCATGATAAAGAAAAAAGGTTAGAGGA A G 328GGCAATTATTACCAGCTTGCAGTTTGTTAAGGGAAC 158089:295AAATCAATCATGAAAATCTCTTTAATAACTTCTGCAGTCTTCTGAGCAGGGCAAAGCTGGGCAAGAAGAGAAC CATTACTGCTATTTTGTAGCAAAGTACCACTGCAGCCGCATTTTGAATCGCAGG scaffold 329TTGACCCTTTTAATGTGGCAGCCCTTCGTTGCGCGGC T C 330ATGAACCAAGTTGTACTGCAGAGTTGGGATGACACA 17267:6243TGAGTTTCTCGAAATGACTGAAGAATACTGTCATGGACTAATAGTCCTCCAAAAGTGCCAAACTCTGCTTCCC AACCTCTGTGAACGCTTTGATCTCTATGGTCTGAGGAGCTTCTGATTGTGAGCC scaffold 331CCTAAGCTAAATTCTAGCTAATTGTAAGCCGAATAAA A G 332AACAGTCCATAATCCTTTTGGTTTCGTCGCAGAATG 491:22100AAAAAACCCTACAAACTGCTGCCCCTTAAGTTTAGATCAAGCAGCCAAAAAAGGAATAACTAGAAATGCTAAT CGATGAGAGGCAGATGAATCATGAACCAAATTTACATGGAATTTCCTTTCACCC scaffold 333CCAGTGAATGGCGACTATTGCAGCGTACGGTGGACAT C A 334TCATTTCCACTGTACTTGTTCCCAAGTTGGTCTTGA 121522:9070TATTATTATGTCCTCTTCCCCACCTTGGAAGCCGTCGATTGCAACCACTTGAGCAGCATATGGAGACACAATA ACTGTCTTCACCTTTACTTGGAAGCCCCAATACTGAGCTCACATTTCGACTTAA scaffold 335GGGCCAAAGCAGCAAGATGGTCAGGCCCTGATAAAGT G C 
336CCCCCAACTGCTGCAGCAGCAGCAGGGGCACCTGTC 24615:2202GTGCAAGCAACCAGCAAAGAAACCAGTCCATGCACTATTAGCTGCAGTTTGAAAAGTTGCAAAAGCTGCAGGT CTTAGCAGTTCAGTCCGAATCAGTGTGCGAAAATTGGTTGAATTAAGATCATAA C32052323: 337ATCATGGTTCTTGAGAAGGTTGGTACTTTGCTTAGTG A G 338AGCGATTTTCATTTGCTGCTGTGTCATAATATAAAT 699TTTGAGTTGCAAAAGTTGATGGGAATTTGGAAAGGAAAAATAAGCATAAATAGGAGGAATAATAAAAGGTTAG GGGAGTGTGGGTCTTGCTCTTTTCTAAAATGGGATTTTTGAGCTGCTTATGGTG C32052323: 339GAGAAGGTTGGTACTTTGCTTAGTGTTTGAGTTGCAA C T 340TTGCTGCTGTGTCATAATATAAATAAATAAGCATAA 711AAGTTGATGGGAATTTGGAAAGGAAGGGAGTGTGGGTATAGGAGGAATAATAAAAGGTTAGAAATGGGATTTT CTTGCTCTTTTCTAGAGCGATTTTCATGAGCTGCTTATGGTGGTCTAGTAGTAA scaffold 341AATGACTTCTGCACTTCAGCTCCTTTTGATCTAGGGT G T 342TACTGGTAATGGACATTCAACTGGTGTAAGACCTCT 14925:8868AGTGTGCAGCAATAGAGCCAGGTTTCAAGGAAAATCGAATTGCAGCCTCGTATATCTAGAATAAGTTTTCAAG ATCCCTTGGATCCGGAATAGGAAATTTTAGCCCACTAAAATAATTAGATGGAAA scaffold 343GCTTCGTAGCAAGATGCAATTAGCATAAGCAGCCAGA A T 344GTAGAGGTCTAAAGCAGACGGAAAAGACATTGGTGA 133681:2742ATAATTTTTCTGTAGAGTATTGCTGCCTTCAGTAATGGTATTATAAAACATAGCTAAAACGGGGTCGCTTTGT CAGTGAGGTTGTTTTCTGGTGGACTTTGTGTGGCTGCAAGGCCACATGGATCTC scaffold 345TTCATCACCATCACCATCACCACAAGTATTTCAACCA G C 346GAATCTTCGAGCGAAGATCTTAATCTTTACGGATCC 9639:84033CGGCGCCCCTCAGATTATTCAGCCGATGAATGTTGCTACCGCCGAGGCTGCTGCTGCTGCAGCGGCGGCACCG TTCGGTGGTGGCGGAGCAGCAGGTACCCTCCTTTCGGGTCAAAGAAGCGGTTCA scaffold 347CAGCAGTCAACCCAGCAGGACCAGCTCCTATGACAAT T C 348GCAGCTTGAATAGGGGATGACTCTTTTTCTACATCC 61482:2893AATTTTCTTTCCAACGTTTGTATGACACCGTGGATAAAGAGTCACAATAGGGACTTTACCAACTTCCAATGTT AGATTCTCCTTGGCATCATCATTAAATCACATGATATTTCTGTACTTCCAACAT scaffold 349AGAGGTCAACCCTCGCATTCTTCGCGGGGAGTATGCA C T 350GGTAAGTTACCTAGGAAGAAGGGCAACAAAGTATAT 30395:12115CGGCTATGTTCGGCCAATTCTGCTGCAGCATTGGGAAGTCAACTACATGAAGAGCAGCAAGTATTGCATTTGT AATAAAGACCCCGACATGAAAATTTTGCAAAAGGGTATGAAGTCAATAGTCCTA C32064647: 351GAGCTAATTTCTGGAAGAAACCCTGTTGATTATAGTC T C 352AATTATAGGCTGAAAGTGAATCTATATTTGCAGGTG 1071GACAACAAGGAGAGGTTAGTCTTGATTATGATTTGTTAATATGGTTGAATGGTTAAAAACTTTGGTTGGGAAT 
AAACCTGTAGTTTGCTTTCCTGCTGCCGGAAATCTGAGCAAGTAATTGATCCCA scaffold 353GTGAAGCAACCTGTGAAAGCAGAGGTGCTTGGCTTCT C T 354GTCATGTCTTCTTCATCGCGCTGCAAATGAAGAGAA 2452:1249CTTGGCTGCTTGCATTTGAATGTAATACTCTTTAAACTTATTAGACAAACTCTTAGTATTTGAACTTGGTTGA TCAACCGGGCAATTCCTAACTAGTTCGAATTGTACTAAAACTATAACTATTCCT scaffold 355AAGCAGTCACACAACAAGAACCTCCACGAAAATCTTG G A 356CCATCCTTGCTCACTAATTCAGCCAATACGTTCTTT 11436:6161TTTAAGAAAGTCTGAATCTGTGTTCAGGTAACCATGCTCCAAGTTATCAGCTGCAAATGCTGCAGCTTTTGAG TTAACTGCCTCCTCAATGTGGTCATCCCTCCATGTCCATCAAATACACCAAAGA scaffold 357CCTTCTCCTGCAACTGCAATCAATGAGTCTCCAAAAG G A 358AGAAACACCATCAATCCCAACTCTCAAACCCCAAAC 4156:16965AAACACCATCTAAACCATCTGTAAAACCTCCAACACCATCTTCTACTCCAAGTCCCAAATTTAAAACACCATC ATCTGGTGCAGCCACTAAGTCACCTATCGTGCAGCCACTGAATCTCCAAAAGTA scaffold 359CCTTCTGCCATCCCATTGTCAAGTCTGGCTGAGGCAG T C 360TGCTGCTGAATCATTTGTTGTTTCTTGATGTAACTG 43435:8325GAAATCTATTGGATGGTGGAGACTGATCGCTGGTAGCTGAAGATGAATGCTTATTCCTACGGCGTCCAGCACC ATTATTGCTGTCTTCTTGTGTAACTACACAGGTACATTTCTAATTGTTCCCCCG scaffold 361ATTTGCAGGGAAATGAGCCGCTGGAACTGCAACTTCT T A 362CAGATTCAGTGAATTTAAACATACAGAGTACAAATA 11297:60144GCTGCTTCTGGAATACCCTGAAGAGAAAACATCAATAGTCGACTGCGCGTTCTGGATTGTTATATGCTGCTCG GTTAAAACAAGATAAAACAGGTGTCAAAGTGCACGAGTAACAGTTTCTCTGTCC scaffold 363GCAGAGGTCCAACTGTTCGTGCCGTAACGATAGATGG A C 364ACTCTCGTATCTCTTCCTAACATGAGTAAGGAACAA 51841:4904CAGCAACTCAAGCATGGTCAGTTTTGTGGTTGAGCAGTGGCAAACCATTGCTGCCATGTTTGATAACGTTCAA CCCACTTCATCTACCACTTCTTCAAATATTCTAGCAATCGTTTGCACAATGAGT scaffold 365TTGAAGTGTGCTCGATAGAAGTAGATGTGGCCCGAGC A G 366GTGACCACGAATGAAGCTGAAGCTTGGGAAGCAGCC 38015:40961TTCGTTGATTCTTTCAGTGGGAATTTCAACAAGGTATAAGAAAGCTTCAGGAGGTTTGCATTTCCTTGCCATT GTGTATGCAACTTACAAGAAAAACCCCAAGAAGACTTGGATTCAGATGACTGTG scaffold 367CATATTATTATTATTATTATTATTATTATTTCTATTT A G 368GCTTGTTTGATGATGATCTTCCTCCTCCTCCTCCTC 16206:63417AACTGCTGCCCTTTATTATTTCTCCATTTCGATAGCCCTCCTGCTGCTTCGTTGGTAATGGCGGTTGCAGAGA CGAAAAGCGAAAAAGACTTGTTGCCCCATCTTTGCTCTGTTTTTGATGCTCCAT scaffold 369CCTTCCACCTCTACGCTCCTAAGTCCTTCTCTACTCG T C 
370TCCTCTGCCACAACCGCCGAGGCTCCGCAGCCTAAA 13781:319CTTCCCCAACCCCTTCACCGCTTCACGCCGCTCCACTTCCTCTCTTCTTACTTTCCAGCAAGCCATTCAACGT GCAACTCCACTATCTGCAATTACTACCTCCAGGTTTGTTTCCAATGTCTAATTG scaffold 371TCAGAGACTAAGAAACATGCCACTATAAAAATACGCC T C 372GAAATCCCAGTCATATTTAAAGATTAACTCAAACTG 27023:20610ATCTTGTTAATACTAAGAAATGTCACCTCACCGCAGCTTGCTGCAAATCTTCAAGTGTAGCAGAAATGTTCTC AGGCACATCTTCTTCCAGGAAGTTATCTGCAGGTCCATAAACTGTTCAGGGAAT scaffold 373TATATTTATATAAGAAAAAGATGAGAGAAGAGAGGAA T C 374TCTCTATACATGTGAACCGTAGCCATTAGCCCCATT 4618:83522GATTTAGAAAGCAGCAAAGGACCCATATCCTCCTCCATCCATTAGCACCATAAGGGACCTTGTTCTCTGTTTT TTAAGGCAACTAAACCCAACTAACTACTTCTTCTTCAGTACGTAAAACCATGTA scaffold 375GCAGCCTAGCAGCACTGCTTGCTCCACGGCCACGGCC G A 376GGAAGAATTTGCAGCCTTGGATTCTGCTCGGTGCCT 111383:2928ACCAGATGGTGATTTTGAACCGAACTTCGTATATCTGGAGTTTTCAGCTGCCAACTGATAGCTTTTGACCAGC GAAGAACTCTTCCTAGTTAATCTTACTCAAGCTGTCGCGCAATTATTTCCGAGC scaffold 377GGTCATTCATTAATGTATGGTTAGTTATTGGCTGCTG C T 378GATTTTTATTCAAGAACTGAATTCTTTTATGCAGGC 35:25795GCACTTGTCTGTTATGCATGAAATACTTATTGTTATATGCTTACAGCTACAGTTGCCTGGCCTTTCCCAGCCA 293 TGGTGAGTTTTCAAAAAACATTTGCATTATTTACATTGGAATTATTCTTTCGAC scaffold 379TGATATAGTACAGTGCGTCCATGGCGTCATCGTAGGC C G 380GCGGCTAAGTTGAATGTGAAATTATGAGTCACAGTT 122455:2010TGCAGGCAGACGGTGCTCCGGAGCCAGACGATACTCCGAGGCTGCACTGAATAGGATGAATCCACCGCCATGG ACAGAAACTACAATGGCAAAGATTTCAAGTAGACTATAAGAGGCAATCGTAGCT scaffold 381TAAATTGGTCGTTTATATTAATGTTATAAAGAACGCC A C 382TCTTCTATCTTCTAGGGTTTGGTAGTTCCCTTGGAC 13362:101120CTAGCTTTAGCTAGAGCAGCTAGAGCGTGAACCCGGAGAAATCGTGGCTGCTTTACTGTTTTTTGGGTTTCAT CTAGTTTTTTTTTACTTCTTGCTCGTGGAAGATTTTTATGAGAATCGAAGCAAT scaffold 383CCAGAGACCATATCTATGCGTAATACATGTAAAGGCT A T 384GTGAAATCGAAACCATTTCCGTTTGCATCAGTTCCA 38557:23764ATTTGAAGCAGCACAGGCAAGTGAGTTTTCGGTTGGAGAGGTATCCGCACCTTTGTGGATAAACAAGTCAGTT TGCCAAGCAAGATGGAGCAACTTTGTAAGAGACAAAATGGGATGAAAATTAAAA scaffold 385AAAGGCTATTTGAAGCAGCACAGGCAAGTGAGTTTTC G A 386GTTCCAGAGGTATCCGCACCTTTGTGGATAAACAAG 38557:23794GGTTGGATGCCAAGCAAGATGGAGCAACTTTGTTGTGTCAGTTAAGAGACAAAATGGGATGAAAATTAAAAGT 
AAATCGAAACCATTTCCGTTTGCATCTAGCTAAAAAGTGGGCAAGACATACCTC scaffold 387GAAGGCGAATGACTTGGCGTTGGAGGACTGAGATAGC G A 388GCGTCCTCTCTCTGGTGAGGGAGGACCTCGTTCAGG 50091:2544GCCTACGCAGCCGTAGACAGGATCTTTCATCCTGGCCAGCTTGCTCACATTGCTCGCCCCAAATATTTTGTGG TCGGCTTCGTAGGCCAGAGAATTAACACGTTTGCGAATTTCTGTGGCTCTTCAG scaffold 389GTGATCTTCCAACATCTGTTGATTGGAGGAAGAAAGG A T 390GCAGTTGAAGGTGTCAACCAAATCGAAACAAAGGAG 16614:72706AGCAGTCACTGGAGTCAAAAACCAAGGCAACTGTGGTCTGGTATCTTTGTCTGAACAAGAATTGGTTGATTGC AGCTGTTGGGCATTCTCAGCTGTAGCAGCTCGAAAAACCATGGTTGTGAAGGGG scaffold 391AGCCATCAAAGAATGCGAAATCAACAGGGAAATTGGG A G 392AACTTAACGATGGCCACCATGAGATCTTTTATGTCT 4877:4542ATCGTTGTGGAGGCTGATAAAAGGATAGATGTTTACAGTTCTGAAGTCACCGTCGGAAGGTTTCTCACTCGAG GTAAAGGAACCTCCATTGTCGCTTAACTGCCATATACATCAGCGTTTAAAGGGA scaffold 393TAGCAAAGATCGAAGCCTCACATCAACACTCACAGAT A T 394GCACAACACTCGCAGATCGAAGAAAGGATGCCCACG 65894:5390CGAAGCTGCCTCACATCAAAACTCACAGATCAAAACTTCTCACAATTTCGGTTCCAACTTTGCAGCTTTTCCT GCACAACAACACTCGCAGATCGAAGCCCCAGCTTCACCTCATTGCTGACAAGTT scaffold 395CATCCGAAGTAGGATCATGTGCGCGGTTGCGGCGATG T C 396CTCCCTCTTTCTTTATACTTCTACACAGCAGCAACT 90107:10791GGCCCCACTTCACTGTCCTTTTCTCACAACACTCAGGACTTAACTTAGCCTCATCAAAATTCAGCCAGGAAGG ACCCACATACACATAACATAACCCCTACCAGATGAGATGACCTTGCTTCTTCCT scaffold 397CAATGAAGATTCAAGACCAACATGTCAATGCCCAAGA A T 398GCGCTGAAGACGAGCTTACTCCCGACATAGAAGATC 68873:1704AAGTACTCTTTTATTGATCCCAATGACGAATATGGAATCTATGATGTTGAGGAGCTGCGCAATGTAGATTGGC GCTGCAAACCCGATTTCATACAAGGCCCTTATCAGATTATGTTGCACTGAAGCC scaffold 399AGGTTGAGACTGCAAGGATTTACAATGTTGATGATCT C T 400ATTGCTGATGAATCAAAACAATTTGAAGCATGGAGG 3842:188176GAAAGAGGTTGTAGCTGCTAATAAAGAAGATCGTCTCGACTCACTGGAGACTGTTCCGACCATCAAGAAACTG CGCAAAGCAATGGAGGCTCAGGCAATAGAGCTTATGCTGAAAGAATAAGGGCTG

TABLE 8

SNP Name | SEQ ID | SEQ ID | Minor Allele | Major Allele | FST | Minor Allele Frequency (Indica) | Minor Allele Frequency (Sativa)
scaffold6803:13242 | 401 | 402 | A | G | 0.733989 | 0 | 0.7745
scaffold543:54226 | 403 | 404 | A | G | 0.722743 | 0 | 0.7647
scaffold281:231978 | 405 | 406 | A | T | 0.720457 | 0.9722 | 0.2049
scaffold729:175391 | 407 | 408 | G | A | 0.708923 | 0.01389 | 0.76
scaffold281:232029 | 409 | 410 | G | A | 0.70327 | 0.8333 | 0.1048
scaffold2409:27009 | 411 | 412 | A | T | 0.6646 | 0.8243 | 0.1228
scaffold25:303031 | 413 | 414 | A | T | 0.650114 | 0.9306 | 0.2241
scaffold63:301758 | 415 | 416 | A | G | 0.631811 | 0.01515 | 0.7037
scaffold92:214563 | 417 | 418 | A | G | 0.628155 | 0.8714 | 0.1875
scaffold3286:29761 | 419 | 420 | C | T | 0.6277 | 0.04167 | 0.7188
scaffold823:18400 | 421 | 422 | A | G | 0.61526 | 0.02703 | 0.6923
scaffold1697:24776 | 423 | 424 | C | T | 0.592909 | 0.8333 | 0.1786
scaffold414:37377 | 425 | 426 | A | G | 0.592212 | 0.9028 | 0.2414
scaffold5405:41764 | 427 | 428 | C | T | 0.591134 | 0.8611 | 0.2054
scaffold6803:13272 | 429 | 430 | A | T | 0.591045 | 0.7414 | 0.1132
scaffold2317:118844 | 431 | 432 | T | C | 0.587962 | 0.8514 | 0.1983
scaffold370:274852 | 433 | 434 | A | G | 0.586146 | 0 | 0.6364
scaffold1014:55374 | 435 | 436 | C | T | 0.585084 | 0.7581 | 0.1273
scaffold3831:62396 | 437 | 438 | A | T | 0.583082 | 0 | 0.6275
scaffold2449:58347 | 439 | 440 | A | T | 0.581874 | 0.6087 | 0.05833
scaffold1006:18102 | 441 | 442 | C | T | 0.581751 | 0.03704 | 0.7
scaffold143:176913 | 443 | 444 | G | A | 0.581249 | 0.08333 | 0.7315
scaffold2317:118609 | 445 | 446 | G | C | 0.580874 | 0.8514 | 0.2034
scaffold3:20935 | 447 | 448 | T | A | 0.576848 | 0.6212 | 0.05556
scaffold543:54256 | 449 | 450 | A | T | 0.576198 | 0.7308 | 0.1182
scaffold360:162466 | 451 | 452 | T | C | 0.575327 | 0.5882 | 0.03448
scaffold423:66668 | 453 | 454 | A | T | 0.573557 | 0.01667 | 0.6604
scaffold1317:10314 | 455 | 456 | G | A | 0.572596 | 0.8676 | 0.2255
scaffold1517:10238 | 457 | 458 | C | G | 0.571584 | 0.01515 | 0.6471
scaffold1880:146873 | 459 | 460 | C | A | 0.571491 | 0.7571 | 0.1339
scaffold1297:57086 | 461 | 462 | A | G | 0.571186 | 0.7833 | 0.1552
scaffold682:73395 | 463 | 464 | C | T | 0.570734 | 0.7344 | 0.123
scaffold92:214750 | 465 | 466 | A | T | 0.569779 | 0.08333 | 0.7212
scaffold143:176964 | 467 | 468 | T | A | 0.567528 | 0.7778 | 0.1509
scaffold942:90233 | 469 | 470 | T | A | 0.566759 | 0 | 0.6293
scaffold942:89745 | 471 | 472 | G | A | 0.566759 | 0 | 0.6293
scaffold24:306989 | 473 | 474 | T | A | 0.565856 | 0.8143 | 0.1827
scaffold1281:68808 | 475 | 476 | C | G | 0.564701 | 0.803 | 0.1735
scaffold1880:146829 | 477 | 478 | A | G | 0.56444 | 0.7273 | 0.1182
scaffold3:20953 | 479 | 480 | A | G | 0.561743 | 0.5938 | 0.04918
scaffold580:4170 | 481 | 482 | C | T | 0.559158 | 0.7429 | 0.13
scaffold604:135874 | 483 | 484 | T | C | 0.559151 | 0.8571 | 0.2241
scaffold1692:85945 | 485 | 486 | C | G | 0.55769 | 0.06667 | 0.7037
scaffold1041:93642 | 487 | 488 | A | G | 0.555826 | 0.1111 | 0.7407
scaffold1926:69537 | 489 | 490 | C | G | 0.554742 | 0.9348 | 0.2845
scaffold4210:100547 | 491 | 492 | G | A | 0.554584 | 0.9 | 0.2627
scaffold759:86606 | 493 | 494 | A | C | 0.553393 | 0.7069 | 0.1132
scaffold25:425571 | 495 | 496 | C | T | 0.553335 | 0.8571 | 0.2308
scaffold692:20843 | 497 | 498 | G | A | 0.552253 | 0.8281 | 0.2018
scaffold2515:21370 | 499 | 500 | T | C | 0.551959 | 0.8065 | 0.1842
scaffold48:65442 | 501 | 502 | C | G | 0.551514 | 0.8784 | 0.25
scaffold48:65391 | 503 | 504 | T | C | 0.551514 | 0.8784 | 0.25
scaffold3177:95787 | 505 | 506 | C | T | 0.550219 | 0.7727 | 0.161
scaffold3681:699 | 507 | 508 | T | C | 0.550046 | 0.01389 | 0.6311
scaffold3570:26136 | 509 | 510 | C | T | 0.548187 | 0.7241 | 0.1311
scaffold388:236449 | 511 | 512 | C | A | 0.546671 | 0.8194 | 0.2
scaffold763:27574 | 513 | 514 | C | A | 0.546635 | 0.05556 | 0.6698
scaffold794:150222 | 515 | 516 | A | G | 0.54559 | 0.06452 | 0.6909
scaffold152:25430 | 517 | 518 | G | T | 0.545375 | 0.8243 | 0.2059
scaffold108:362803 | 519 | 520 | G | A | 0.545191 | 0.7222 | 0.1271
scaffold388:350925 | 521 | 522 | A | C | 0.544283 | 0.04054 | 0.6518
scaffold2218:32474 | 523 | 524 | T | C | 0.542777 | 0.7188 | 0.1271
scaffold604:135639 | 525 | 526 | G | C | 0.542637 | 0.8472 | 0.2281
scaffold2741:33031 | 527 | 528 | G | A | 0.542595 | 0.8194 | 0.2034
scaffold4991:44264 | 529 | 530 | A | C | 0.542051 | 0.01351 | 0.614
scaffold2483:94844 | 531 | 532 | G | A | 0.541013 | 0.8125 | 0.1961
scaffold3871:24039 | 533 | 534 | T | G | 0.540111 | 0.07143 | 0.6897
scaffold616:154319 | 535 | 536 | T | C | 0.540002 | 0.06061 | 0.681
scaffold1005:75114 | 537 | 538 | T | A | 0.539944 | 0.01562 | 0.6293
scaffold4991:44417 | 539 | 540 | T | C | 0.538537 | 0.01351 | 0.6121
scaffold38:121620 | 541 | 542 | G | A | 0.537429 | 0.8611 | 0.2455
scaffold1005:75091 | 543 | 544 | A | G | 0.536369 | 0.01562 | 0.6271
scaffold298:209613 | 545 | 546 | A | G | 0.534826 | 0.07143 | 0.6864
scaffold152:25429 | 547 | 548 | T | C | 0.534254 | 0.8243 | 0.2143
scaffold1003:70848 | 549 | 550 | T | C | 0.534107 | 0.02778 | 0.6333
scaffold839:64955 | 551 | 552 | A | G | 0.533035 | 0.7419 | 0.15
scaffold4070:22473 | 553 | 554 | C | G | 0.532019 | 0.06944 | 0.6731
scaffold1419:187408 | 555 | 556 | T | G | 0.53116 | 0.8636 | 0.25
scaffold3871:24091 | 557 | 558 | G | A | 0.531082 | 0.07353 | 0.6864
scaffold3871:23604 | 559 | 560 | C | A | 0.531082 | 0.07353 | 0.6864
scaffold3871:23603 | 561 | 562 | G | A | 0.531082 | 0.07353 | 0.6864
scaffold3:388526 | 563 | 564 | C | T | 0.530397 | 0.5625 | 0.04839
scaffold4:729494 | 565 | 566 | A | G | 0.530117 | 0.5735 | 0.05
scaffold2283:29120 | 567 | 568 | C | G | 0.528671 | 0.7286 | 0.1404
scaffold4350:64549 | 569 | 570 | T | C | 0.528523 | 0 | 0.5784
scaffold2283:29096 | 571 | 572 | T | A | 0.527957 | 0.8235 | 0.2155
scaffold3469:17543 | 573 | 574 | A | C | 0.526758 | 0.6774 | 0.1091
scaffold1022:170497 | 575 | 576 | A | T | 0.526672 | 0.6111 | 0.07143
scaffold964:106240 | 577 | 578 | C | T | 0.526638 | 0.8824 | 0.2705
scaffold1419:187788 | 579 | 580 | T | A | 0.525371 | 0.8429 | 0.2368
scaffold3386:35408 | 581 | 582 | T | A | 0.524758 | 0.4848 | 0.01613
scaffold575:241067 | 583 | 584 | G | C | 0.524744 | 0 | 0.5877
scaffold1863:93953 | 585 | 586 | C | G | 0.524401 | 0.8571 | 0.25
scaffold575:459449 | 587 | 588 | G | C | 0.524202 | 0.8333 | 0.2288
scaffold773:155167 | 589 | 590 | T | C | 0.523913 | 0.4483 | 0.008197
scaffold773:155164 | 591 | 592 | G | A | 0.523913 | 0.4483 | 0.008197
scaffold107:154377 | 593 | 594 | C | T | 0.522808 | 0.5862 | 0.06557
scaffold371:34271 | 595 | 596 | C | T | 0.522699 | 0.02857 | 0.625
scaffold78:341641 | 597 | 598 | A | G | 0.522194 | 0.7571 | 0.1667
scaffold156:85625 | 599 | 600 | T | A | 0.521494 | 0.01429 | 0.5918
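
As a reading aid for the FST and minor allele frequency columns of Table 8, the following is a minimal sketch of how a fixation index can be computed from two population allele frequencies. It uses the textbook Wright formulation, FST = (H_T - H_S) / H_T, with the two populations weighted equally. This section does not state which estimator produced the FST column, so this is an illustrative assumption and is not expected to reproduce the tabulated values. The function names `heterozygosity` and `fst_two_populations` are hypothetical.

```python
def heterozygosity(p: float) -> float:
    """Expected heterozygosity 2pq at a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_populations(p_a: float, p_b: float) -> float:
    """Wright's FST = (H_T - H_S) / H_T for two equally weighted populations.

    Assumption: equal weighting and no sample-size correction. The estimator
    behind Table 8 is not specified in this section and may differ.
    """
    for p in (p_a, p_b):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_mean = (p_a + p_b) / 2.0
    h_total = heterozygosity(p_mean)  # heterozygosity of the pooled population
    h_within = (heterozygosity(p_a) + heterozygosity(p_b)) / 2.0  # mean within-population value
    if h_total == 0.0:
        # The site is monomorphic in both populations, so there is no differentiation.
        return 0.0
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    # Minor allele frequencies (Indica, Sativa) taken from the first row of Table 8.
    print(round(fst_two_populations(0.0, 0.7745), 3))
```

For the first row of Table 8 (frequencies 0 and 0.7745) this simple formula gives roughly 0.63, whereas the table reports 0.733989. The gap is consistent with the table having been produced by a different estimator, for example one that corrects for sample size.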

TABLE 9 SEQ SEQ SNP Name ID Upstream Sequence Minor Major IDDownstream Sequence scaffold 401 AAGGAAGAAGTTGGCAGCAAGCTCACCGTG A G 402TTCAACTGATTCTATTTCAATTCATAAATTTTTGTT 6803:13242GTATCCACTGAGGAGATCATCCTCGAGCGA TTTGAAAGTTGGATTATAATATTATATAGGTAGCAACCTAGGGCTCTCGGTACTTTATTATCATTC GGGGGAAGAAGCTAAGAATTCCGGGGAT CACAATGATTscaffold 403 AAGGAAGAAGTTGGCAGCAAGCTCACCGTG A G 404TTCAACTGATTCTATTTCAATTCATAAATTTTTGTT 543:54226GTATCCACTGAGGAGATCATCCTCGAGCGA TTTGAAAGTTGGATTATAATATTATATAGGTAGCAACCTAGGGCTCTCGGTACTTTATTATCATTC GGGGGAAGAAGCTAAGAATTCCGGGGAT CACAATGATTscaffold 405 TCATGTTACTGCTGATTATCGTCACCGATA A T 406CTGAGTGATTTAAGAGAACTTGCTTCAGCAGAGAGG 281:231978ATTGCTTCTAACCTTTCTACATTGTGGTTT CCTGGATACTTCACGCTCATTTACACAAATATTCTCTTTTGACCTGGCTGCAGAAAGTTGAAAACA GAACATCAACCTGACTGCCCGGTAACTT GAGTAAGAATscaffold 407 CCAGCAAAAGTGGCTCGTGCTGATGAGAGT G A 408TCAGATACACCAGTACCATCCACAATAACCCTGTGG 729:175391CGGGATTGCTTTTCCTGAAACCAGAGATAA CAGACAAGTTCTCAAGCTAGTAGGAATGAATGTCCTCTCTCACCCGGGCCAAAGAGGGGTGTTCTG CCGCAAATTGAAACCACTGTTCGGATCT AAACATGCAGscaffold 409 TTGTGGTTTTTTTGACCTGGCTGCAGAAAG G A 410CTCATTTACACAAATATTCTCGAACATCAACCTGAC 281:232029TTGAAAACAGAGTAAGAATACTGAGTGATT TGCCCGGTAACTTATCTGATACTACAGAAATTGCATTAAGAGAACTTGCTTCAGCAGAGAGGCCTG TTTGTTCTAAGCATTGCTTTCGGTTTTT GATACTTCACscaffold 411 CGATGTGTGGGTTTTCAGATCTCTTTTAAG A T 412TAAGATTAAAAACATAAACAAATATGATAACACTCA 2409:27009GCGTAAGAAAATTTTAATAAAGGAATTGAC TGCAGATTACAAGGAAAATGCACATGGCCACTGCAGTATATAGGCAGGCAGACAAGAAGATGGTGG GTAACAATGTTTGACTTTTTGTTAATTA CTTCACTTTCscaffold 413 ATGAAGCAGGCCGGAGAAAAGACGTTTTTC A T 414GAAGAGTTGCGAAGAGAACTCGCTTTTTGCACCTTA 25:303031CTGTTTTAAACAATCACTAAAGTTAGGAGT ATAATTTAACACTCAGCATTATTTCTTATTCGCAGATAGGAGAACTGAAAAACAAGTATGAGATGT GAAAGAGAGAGAGGCCTCTTCGGATGGA AATCAAATTAscaffold 415 TGCGACCCATAGCCTGTGAATCAGAAGATA A G 416GCAAAAGCTACATCTTCCGGGATGTTCTTGTCAAGG 63:301758TAATGCTGATTGCGCCCATATCATGCAAAA TGATGGCAGACCATCTGTATGTAGCAAGACGAAAGTTATCTTCTGCAGCAATTGTTTCTGCCCTTA GAGAATATTGAACAGCCATAGTGCACCT TTCTTGATTCscaffold 417 
ATTGTTGGTGCATATGAGGAAGGCCACCTG A G 418GAAGCTCCCCATGACCCTAAATTCCCCCCAGGTTGA 92:214563CAGGTCCATAATATCCTTGCCAGTACATTG TACAAAGGCAATGAACCTTGAAAAGTGGACCCAGGAGCATAGCAAGCCCACCACCATTTGCACTCG AGTCCCAGTTGTGCTGTATGGGAACCAA GTGCAGGAGGscaffold 419 GCCACAACTCCCACCATCTTCCTCAGCCAC C T 420ATATGTGAAAAAAAAAGAACAGAAAATTTTCAAAAT 3286:29761CCTGCAGCTTCATTTCTTCTAGAAGCTGAG TGGAAGATTACAAGTACTGGCCATAAACACCAAACAGAAAAACATAAAAGACACAATTAGGAAACT CCAAAATTCTCAAGAGTCAGATCCTAAT AAATTAAATAscaffold 421 TAGAATGGTTTTGCTATTTGTATATGAATC A G 422TACGCTGATTCATGCAAACAAATCTTCTCAAGCTTC 823:18400ATGTTCACTTGAGTAGAAGTCTATACTCAG GTAGGCCAATCTTCTCCGAGAAGCAAAGGGTTCATTCAAAAGAATCTGAAGCAATCACTTATCTGC TCCATTACAACCCGGAAATCTATCATGG TCCCAGTTTTscaffold 423 CCAAAAACTTTTCATTTAACTGTTCTCATG C T 424AGCTGGTGCCTGGAAGTCAGTACTAACTGATAAGCT 1697:24776TTAAAACTGTGGAACAAGGAACGAGTAGCT ATTCTTTTCTTTCCTTCCAGAGTGTCTCTTCTAAAGGCAGCTGCGAAAGTTTTGCAGGTAAATTTG TGATGGAACCTTGGAGAATCGGCCGGTC AGTTTGATAAscaffold 425 CGCCCTGAGCACCCTTAATGGGAGTGGTTG A G 426TCATGGCATATTTCGCTATGTTTCTTCTGCGTCTGA 414:37377GAGCAGACCTCTTCTGCTGCAGCGAATGGA ATTTTCGACAGTCCGTTCCGTGCTGTCCCTTTCCCGGTTCGTCGATGGTTTGGGCCTTCACCGTTG GTCCGAAGCTGAACTCGCCGTTAGCGTT GCGCACTGTCscaffold 427 CCAATCAATCTGCAGTGACGAACCTCAGCT C T 428GCTTTTGCTTATCACCAACATCAACCACTGTCGTCG 5405:41764GAGACTGAGAAAACCAAACACAAATGACAG CAACCATCGTTTATTCAATGCGCACTGACGAAACAATATGTTTTCATGTAAGCTCTCCCCAAATCC GGCCACCGCTTCCTCTCCACTCTCTCAG CAGCCCCATTscaffold 429 GTATCCACTGAGGAGATCATCCTCGAGCGA A T 430TTTGTTTTTGAAAGTTGGATTATAATATTATATAGG 6803:13272CCTAGGGCTCTCGGTACTTTATTATCATTC TAGCAAGGGGGAAGAAGCTAAGAATTCCGGGGATCCCACAATGATTGTTCAACTGATTCTATTTCA ATTGAGTGCGAAAGGCGGTGCTGTTCTC ATTCATAAATscaffold 431 ACCAATGAGAAAAACAAAATTAGTTACACT T C 432TACCTGAAGTTTTATAGGATCAGGATTTGATCCATC 2317:118844TTTTCGGTTTTAACATCTCTAAACTTTAAA AATAATGTCAATAGCTCCATTGATCCGGAATTGCTCCCAATAAACACGTTCCAGCATTATAAGGGT CCAAGAGTCAGTGAAGTACCAACAAATC GTTCGTTCAGscaffold 433 AATATCATTTGCTATCTTGTAAAGCTCTAA A G 434GACACAAAATATAGCTCAATATAACTATACCGAATC 370:274852AGCTGCAGGTAGATTTTGTGCCTCCTTTCC 
AACATACAACAAACACAATCCAACGCACACTACAAACATTCCAACTGCTTGAGCACCCTGAAATAA AGGATTCAAGAATTATGTCTCCGACTAA GAAACAGACAscaffold 435 CTATTAGACTGTAGAGCATTGCATGAAAGG C T 436ACATTAGGCAACTCCTGCTGAATGGCAGTGCTTACC 1014:55374CTTACTTTATTGGTTTAAAAAGTATAGTTC TGTTCCGGAAATGTATGTGTTCACACAAAATCAGGCGGGTAGATGGATTCTGCAGATAAAGTTTCA TCAGTATGACCTTTATAATGTCAATTAG TCTTTCCCATscaffold 437 GAAAAAACACCCCTCCAGCCTGCAACATCA A T 438GTTGAGAAACAAACGAAACATTTAAATCCAAGGAAC 3831:62396ATTCAGTTGACAGCTTGCAGGACTCGAGCA ACGCAAGTCACAAATGCCTTGAACCGACACCATTCTAGTTTCGTACCAGTAACTCTGCAGATCACA TCCCAAGATAAAAATAACACGATGACAT GAAAGGTATCscaffold 439 CGATCATTCGTACCTGCAGTTGAAAGTTGA A T 440ACATGATGTTACTTTTTCAGTAACACTACCCCAACT 2449:58347ACCTTTTCAAGTTCAAATGTTGATGAATTA ACTTCCATTTAAGGTTGGTAACCCAATACTTCTACCACAATACAAGGAGACATTTGATCTTCATCC TTCATCACATGTTGATATTTTTCGATTC ATGTTTGTTGscaffold 441 TGTAAGACCAACGTGAACAGAGCTAAGGAC C T 442TCCACTAGGATTTTCCATAGCAATGCTTGGCTGCAG 1006:18102CATAAGAAGTTGAAGGCCGTTGAGGAACCA AGTGACATGCTTATTCGGAAGTAACCAGCGTGAAACGGCGAAACACTTGTCGTAATGAGTGAACGT CTGGTTAAAATAGATGATAATATGGGTT CGTCGTTTGGscaffold 443 CGTAATGCTGTTCTTGCTGTTATGTCTGTT G A 444CAGGATAATTCTAGTAAGCATAATGCTTTTCTTATG 143:176913TATCGACTTCCTGGTGGTGATCAATTGCTT CTTTTTACTTGTGATGAAGATCGTGCTGTTAATTACGTCGATGCACCGGAGATTATAGAGAAGTTT CTTTTTACACATGTTGATAGAATTACTG TATCTTCGGAscaffold 445 CGCTAGAATTGTCGAGAAAAGTTTCTTTAG G C 446GAAAACCATGATTTCTCTCTTTGCTGCATCATTCAG 2317:118609TCGGATCTTCATCTATACTTGGAAGACCTG AAAAGAAAAGACAAATATGTGGAAAATTTAGATAGTGATTAGGCCCCAAATACTGCAGCCTAGATT TCAAAGGGAAGTTAAAAGCTGAACTAGC TAAGAGAACTscaffold 447 GACTCCTCAACCTTTGATGGAGACGAAGAA T A 448ATCCGGGCATCATTTCAGTGCCTTATTTTATTCTTG 3:20935GATGAATTGTACCAACACTTAGTTTAAAAC GTGTAAAGATTAAAGATTAAAAATCAAAAATCAAAAAAAGCTTTTTACTAGATTTTCTAATCATTA ATCAAAATTCTTTCAGAAGATTTCAAGC TTATTTTGTGscaffold 449 GTATCCACTGAGGAGATCATCCTCGAGCGA A T 450TTTGTTTTTGAAAGTTGGATTATAATATTATATAGG 543:54256CCTAGGGCTCTCGGTACTTTATTATCATTC TAGCAAGGGGGAAGAAGCTAAGAATTCCGGGGATCCCACAATGATTGTTCAACTGATTCTATTTCA ATTGAGTGCGAAAGGCGGTGCTGTTCTC ATTCATAAATscaffold 451 
TGGAAACAGGAATAGATGCCCCACCGGAAC T C 452CGCCGCTTCTTCTCTTCCTTCAATTGTTTCTTCATA 360:162466TTTCTTCTGTATTTTCACCTAAATCTCTGG AGAAGCTTTTCCCTATATTCTAACTCATCATAATAGGCAAATCCTTCACAGCTTCTGCCATTTTCT GCCTTCTTTTGAGCTTTGGAAAGCTTTG TCATAATTTTscaffold 453 GAAAAAACACCCCTCCAGCCTGCAACATCA A T 454GTTGAGAAACAAACGAAACATTTAAATCCAAGGAAC 423:66668ATTCAGTTGACAGCTTGCAGGACTCGAGCA ACGCAAGTCACAAATGCCTTGAACCGCACCATTCTTAGTTTCGTACCAGTAACTCTGCAGATCACA CCCAAGATAAAAATAACACGATGACATC GAAAGGTATCscaffold 455 GGAAGTAAAGCCAAACTAGGCCAAGAGAAA G A 456CAGTGTGCCAAGGTGTGTTCCTGCAGGTATTGAGCT 1317:10314AGGCGAAGGAGATGGAGCAATGTAGCATCC TAAGAAGTTGAAAATGTCCAATACACCAGATAGGACATAAAAGATGGAACAACTAGCATTAAAGAG AATGGACGATCTCCAAAGCTTCCACGTA CTCTGTGACAscaffold 457 CACATCATAAAGAGTATCACCAATCTTTAG C G 458ACTTCTTGACAGTGTTCGCAGCCAAAGGTGCCCGTG 1517:10238CTTGATGGCACCACTTCTGTATACAAGCAT TGCTTTCGTTTACTGGCTGGCCATCATTGGTAACTATTTACCCACCAAGCCTGCAGGTATCTCGTT ATTGTTTCATCAGAGGCATAGTTGGTGG CAATGCACAAscaffold 459 TACTAAAAATCTGGCCCGACAATGACGAAT C A 460GCAGGGTAGGGTAGAGAGCAGCAGCTGTACTATGTT 1880:146873TCAAATGAGTAATTCATCCCGGAAAAAAGA TACCCAAATTTGAAGCCACCAGGGGTAGGGTGGGTTGCAATAAAATAATTTCTTTAACAAAAAAAT GGCTTGCTCCAAAAGGAAAACCTTGTTG CCTGATCACAscaffold 461 GGTGCCATGAACTACCGGGATAGTTCCATA A G 462GCAACCAGATCAAGAATCAGTCAGTCCCATGAACAT 1297:57086TCTCATCGCATACAATTGATTCAAGCCACA ATTTCATTATGTTTAAATAGTGAAATAAAGAGTCACGGGCTCAAATCTCGAAGGCATCAGCAGTAT ATCCCAAGTTTTGCGTATCTATAAAATA ATCACAGCTGscaffold 463 TTCTACTTTTCCCCTTCACACCTCCCACAC C T 464CGCTCCTCGCCGCTGCAGATTATCTCTCCGATCGGG 682:73395CCCCACCGAGACTTCACTTTCATCATCAAG TACACCTCCACGTCTATATCGACCACTTCGCATTCGGTTCTCGCCTACAACCGCCTCGACTCGCTG TCAATGGCTCCACTGATGTGGATCGTAT GCCGCTGCCTscaffold 465 TGTATGGGAACCAATATCAGTCAAAGGCCC A T 466TTTTTTTAAAGCAAGTCAACATTTATTTACCATTTT 92:214750AGTAACAGCTGATGGCAAGCTTGAAGATGA GGAAGATGTATTATCAATAAGTTAAAAAAAACATTCGGTAACTGGGCGAGTATAGTGAGACTGCAA CCAATCTCCCATACCTGAATGATAGCCG CATAGATTCAscaffold 467 CAATTGCTTGTCGATGCACCGGAGATTATA T A 468GAAGATCGTGCTGTTAATTACCTTTTTACACATGTT 143:176964GAGAAGTTTTATCTTCGGAACAGGATAATT 
GATAGAATTACTGATTGGGTGAACAGCTTCAGATGGCTAGTAAGCATAATGCTTTTCTTATGCTTT TTGTGTTAGAATTGATTAAGAAAGTTTG TTACTTGTGAscaffold 469 GGAAACGAAAGGCACCACGCCAAACGTGAG T A 470GGCCGAGTAACACGCAGAGCAACCCGGCCACTGTCT 942:90233CGCTCCAGCTGCTTGAACTGACACAGAGAA GCACTGCCCAGGAACCCCAAAGTCTCCCTCTCAATTTCCACACAAGACAACAATGGAGCCCCATAG TTTCTTTCTATAACTTCTTTGCATTTTT CGAGTTACCTscaffold 471 CTTCGATTTCTGAGCTTACTAAGTGAGCCA G A 472AACCATTTGAAACAAATAGGTTCCTGATAGAAAGAA 942:89745TGCTGTTTGACGTGGCTTATCTGCAGTAAA GCCATCTCACATATCTTCTGTTGCCTTGGCAGATTTCGGTCCACTCCACTTTACAAGGTATTTTCT TCAGAGCAATGTACTTCATCGTTCAATG TAAAGTTTACscaffold 473 TAACGTTTCCATTCTTTATGTCTTGATTTT T A 474TTGTTAGACGATTTTAGCTTCTACGAAAGCCGAGAA 24:306989TTGCACAGGTTGGATTAAAGCAAGGACTGG AAGGTCCTCCTAGCCAAAAGAGCCATTAACATCAAGAAATGTTGAATATATTCAAGAGTTACACAC CCTGCAGCAACATTGAAGATGGAAACCA CGAAGACTTCscaffold 475 ACAGTACTAATGGAATCAACGGAGGAGATG C G 476TGGTACTACTTGCAAGGAAGAAGAGGCAGGTACAGA 1281:68808ACGGTGTTTGATTCTTCATCAATCGGTGAC AAACGACACCAATGATGACACAAACGACAACTCGAAGAGGGGGCTTTGGAGCAAATATTTCAGTAC AAACTGCAGATTTCTTCCAATCAACAAC ATGGGAAAGAscaffold 477 AAATACTTCAATATAACAAATTCCATAGAC A G 478AGAGCAATAAAATAATTTCTTTAACAAAAAAATCCT 1880:146829GAGCCTATGTGCCATACTAAAAATCTGGCC GATCACACGCAGGGTAGGGTAGAGAGCAGCAGCTGTCGACAATGACGAATTCAAATGAGTAATTCA ACTATGTTTACCCAAATTTGAAGCCACC TCCCGGAAAAscaffold 479 GGAGACGAAGAAGATGAATTGTACCAACAC A G 480TGCCTTATTTTATTCTTGGTGTAAAGATTAAAGATT 3:20953TTAGTTTAAAACAAAGCTTTTTACTAGATT AAAAATCAAAAATCAAAAATCAAAATTCTTTCAGAATTCTAATCATTATTATTTTGTGAATCCGGG GATTTCAAGCATTGTAGTAAATCTTTAT CATCATTTCAscaffold 481 GATCATAATGAGTACTTGATCTAGCTTGAG C T 482TGGTAGTTAGATTCAGAGTGAGAACAAAGGACTCTT 580:4170CACTGTTCATCCACAAGTTGACAGAAGCTA TGTTAGATGGCCATAGAGTAGTAAACACAGACTGGTAAACCTTTTCCTTCAAAAATCAGGCTGCAG CCAAGAGATTTACAAACCGTATTAATAT AAGTAGACAAscaffold 483 ACCAATGAGAAAAACAAAATTAGTTACACT T C 484TACCTGAAGTTTTATAGGATCAGGATTTGATCCATC 604:135874TTTTCGGTTTTAACATCTCTAAACTTTAAA AATAATGTCAATAGCTCCATTGATCCGGAATTGCTCCCAATAAACACGTTCCAGCATTATAAGGGT CCAAGAGTCAGTGAAGTACCAACAAATC GTTCGTTCAGscaffold 485 
AAAAATAGTTTTGTTATATTATATCTTCTC C G 486AATGCAGTGATCATTGAGGCAGATGCTTTCAAAGAA 1692:85945ATGGAAAATAGCTAATAGTATTCATTTGAA TCAGATGTTATCTACAAAGCCCTTAGCTCTAGAGGCTTGAAATGTTGAACAGACCATTCTGGGCAG GAGCTGCAGG scaffold 487TATATAATATATATATATAAGATAATTAGA A G 488CACAAAATAGTTACACCCAAAGCAAATGTTGATATG 1041:93642TAACATAACATGTAATTAAGGATTAAGCAT ACCAAGCAAGGTAAAAAATATGGAAATCTGTTATTTGAAATTATTAATTTTAATTTTAACATACCG CATAATTATAAGGTTAATATGTCATTAA GCAGCCAATAscaffold 489 ACAATTGAAAAAGTTGTAATTCATTTATTG C G 490GCCATCTTTTTCACTTTTTTGTTTATTGTATTATAG 1926:69537AGGGATGGAGATCCCAACATAGAGTGAGCT TATAGGTACGTCGTAGAAGAAAAATGAGAAGAAGAGCCTTACCATCTGACCCACCCCTTTGCAGCT AATTAAGGAATAGGTGATATTATTATAT GCAGAACATGscaffold 491 GCTTGAGGAGAGAAGAGAGAGTAAGAAGGG G A 492AGCTCCATGGCCATAGTAGCTAGCTTGATCCGAGTA 4210:100547GAATGGAGCAACGATCACGTTCTGACATCA GCCTTATCAGACGGCAAGGCCTCCAAATATGGTATACACCCTTGATGGTACGGTTGTAAGCAACTC TTGAGATCAAAGAAACCCATTCCTACTC CGGAGTAACCscaffold 493 GGGGAGATAATTGTACGACATTCGATGTTA A C 494TGTGGTGTTGAATCCGGTGTCGTTTTGGCCCCTGCT 759:86606TGTCGTTTGGTGCAGTTGGAGATGGTGTGG GATTATTGTTTTAAATTACTTCTACTATCTTTTCTGCAGACGACACTGCAGCCTTTAAAGAGGCAT GACCCTGCAGCCCCGGATTAATGTTCCA GGAAAGCTGCscaffold 495 CCAATCAATCTGCAGTGACGAACCTCAGCT C T 496GCTTTTGCTTATCACCAACATCAACCACTGTCGTCG 25:425571GAGACTGAGAAAACCAAACACAAATGACAG CAACCATCGTTTATTCAATGCGCACTGACGAAACAATATGTTTTCATGTAAGCTCTCCCCAAATCC GGCCACCGCTTCCTCTCCACTCTCTCAG CAGCCCCATTscaffold 497 ATATTTAGCCCAAATGAAAAATTAGGAATT G A 498CTTTTGGGGTGTTATTAGCAAATAATTTTAAGCTGA 692:20843TTTCCTTTCACCTTTGATTTTGATTTTAAA ATTTTCTGCAGCTCTTGGCTGAGGACCCATCTTTAAATGTTATAACATTATGGCCGGCTAACTTAG AGCGATTCAAGTCACATAAGCCTAATGT CTATCGATTTscaffold 499 AAACGATCTGACGCCCTTCATCTTGATCAA T C 500CCTCGCCCTATTGAGCCAATTGCAATCTCTGCAGCT 2515:21370TCTCAATTAAACAAGATTTAGCTGAGTCCC TTTGGAGCTGGAACTTTCAGCGCCATAACTCACCAACTTTACAGTATTTGAGCCTTAGATCAATTG CGCCTGTTCCTTCACCAATCCCAGATCT AGATGTCGTAscaffold 501 TTTCTTCACCGTACATGTATAAGGAACAAC C G 502CCAGCGGGCAAGTTTATTCTGAGGAATCTGCAGAGA 48:65442CGGTCATAGTTTGAGAAGCCACAACCACAG 
TAAATTTCATAAATTTGAGGAATAAGTACCCAAAACATGACGCATCATCACAGAATTCAACTGCCA GAAGAAAATCCATAAATGATCATTGGAA CAGGATGACCscaffold 503 CAGGAAGAGGAGGCTTGGTTTGCTGGTTAG T C 504ACAACCACAGATGACGCATCATCACAGAATTCAACT 48:65391CTCCATCAGAAGATTTGGTCTTTTCTTCAC GCCACAGGATGACCGCCAGCGGGCAAGTTTATTCTGCGTACATGTATAAGGAACAACCGGTCATAG AGGAATCTGCAGAGATAAATTTCATAAA TTTGAGAAGCscaffold 505 TACCCCTCCCGGACTTCCTATGATGCGTCT C T 506TTCCCCCTTCCGGCGTATATATCGGACCAGCTCTCC 3177:95787GCAGCGTGTTTTTCAGGCTCGGAATCGTCG ATCAGCATCTCTCTCACGCACGCTACGTGTAGATTGACCTGGAAAACGAGGACGAAGACGAAGAAG TACTTCTTGCACTTAGACCGGTAGCTCC AAGTAATGGCscaffold 507 AGCCTCATTTGGCTCGATTTGAACAGCAAT T C 508AAGCAATTTGCATTTGTGAGAAATGAGGGTGGAACA 3681:699CAACTCTCTGGCACTGTCCCCGCCGACTTG GCCTGCAGGGGAGCCGGAGGACTAGTTGAATTCGAGCTAACCAGGCTGGCCTAGTGGTTCCCGGTA GGTGTTCGGCCTGAGAGACTAGAAAACT TTGTTTCTGGscaffold 509 TCTACTTTTCCCCTTCACACCTCCCACACC C T 510CGCTCCTCGCCGCTGCAGATTATCTCTCCGATCGGG 3570:26136CCCACCGAGACTTCACTTTCATCATCAAGG TACACCTCCACGTCTATATCGACCACTTCGCATTCGTTCTCGCCTACAACCGCCTCGACTCGCTGG TCAATGGCTCCTGATGTGGATCGTATGT CCCGCTGCCTscaffold 511 TAGAGAAGAGAAACTGTAATGCCTACCTAG C A 512TTCAAACCAAACAAAGAACCGGAACTTTCCTTATCC 388:236449TTTCACTCAAGTAAGAAGAGTCTGATTTTA TCATCATCTTCATCAATATCACTGTCTTCTTCACTACAGTTGAAGACGCTGCTGAACCCTTGGCAC AGCAAAGCCCCAATGCCCAACCTTTTTC TAGCTTCCTTscaffold 513 TCGGCGAGCAAGTATGTGTAGGTGGATGAT C A 514TGTTTTTAAACACCCTAACTTGGTCGACTTTACCGG 763:27574TCCTTCTCAAAGAGCTGACGGAAGAGGAGC ACTCGATAAAATCCTCGAAGAACAATTCAACAAAGCTTGCCAAAGAGATGAGACGACGTCGTATAA TGGATTAGATCCAGTAATGGTGAAGACT GAGCAAGTCGscaffold 515 GCCACCAAATGGGGGCATTGAGGCCAGGCT A G 516ACGGGATCACAGCCAACTGCGTAGCCCCGGGGCCGA 794:150222TTGGGCATACGCGCGGGTCAAAGGCTGCAG TCGCAACAGATATGTTCTATATGGGAAAGACTGAGGTGGAGTCTATGACGCAGATACTTGCAAAGG AACAAATTCAGAAAGCGGCGGAGGAAAA AGTTGAAGGGscaffold 517 ATATAAAATTCAATGCCAAGTGATTCAAAA G T 518TGTACGGTCATATTTCTTAAAACAAATAAATAAATA 152:25430CCGGTTTAATCTTCAATAAATCTTTATTAG AAATCACACAAGCCAATTGGAAAGAGGCTATTCAACTAATGGCTTGCATTCTTCAAAAAGAATCAA TTTATACAATATAGACCTATAATGTACT CCTTTACAATscaffold 519 
CAAAGTACCACAGAGCCATTTCCAACATGC G A 520AACCAAAAGAGCCCAAGTAATACAAAAGCAAATGAC 108:362803CCAGCGTAATCACTAGAGTAATACAGTTCT ATGAGCCCATAAAATTTCATAAGTGGTGCCATTCTGGCAGAGGCAGAACTTCTCTCCAAAATCTTG CCGGTAGATATCCATTTGGGTTTTTCCA CATACTGTGAscaffold 521 ATCAGTAACCAGATTCACCCAAAAAGCTGC A C 522GATATCATGTACCTGAAAATTGACAATATGAACAGA 388:350925ACTGGTATCATACATTCGGTATTCCCAATG ACTATATAAGTTGCTCAACAATGAGATATTTCATATCTGCAGTCAAGAATATGGATATAACCTCTC AAGAAAGGGTATATATGCAGAACTAAGA CAACATTTGAscaffold 523 CTCCTGTGTACAGGCAGTACTTGAAGATGA T C 524ACCTTTCTACCAGGTAATCAAATTGGATACAACAAA 2218:32474TGGAACCTCATCTTAATTAATTGTTCCAAT GTAACCAAAACTCACAGGGTTCTCACTTCATGGAAGCCCACAAAAACTTTATGGAAAAGTTTGATT AATTCCAAAGACCCTTTACCCGGCCTTT ACCCTTCCCAscaffold 525 CCGCTAGAATTGTCGAGAAAAGTTTCTTTA G C 526GAAAACCATGATTTCTCTCTTTGCTGCATCATTCAG 604:135639GTCGGATCTTCATCTATATTGGAAGACGCT AAAAGAAAAGACAAATATGTGGAAAATTTAGATAGTGGATTGGCCCCAAATACTGCAGCCTAGATT TCAAAGGGAAGTTAAAAGCTGAACTAGC TAAGAGAACTscaffold 527 AATTTCGATCAGGTTGCAGCCAACTTTAAA G A 528ATTCCAGTAAAAAAACTTGATATTTATTTTTCAGGG 2741:33031CAAAATGTCTTTTGAGTTCTTACATCAATT CCGGGTCACCATTTCTCTGTCGATGGACTACAAGTTTGTATGAATAGGTTCACCTGGTCAGCAGAA CTAGATTCTTCATGGAACAACCTTTCAA GAAGAAATCTscaffold 529 GCAACATACATATAATTAATGTCAGCCGGT A C 530TCATTTCTAGTTGGAGCAACAAAGATAATATTAGAT 4991:44264GATAAAAACATTTCTATCAATAAATTTCTA ATCATCCAAGGAGATAGACACATATGACATTGATATATTAAGAACATCAAACACTTTGACCATTAC GTGTTATTACGTTTGAAGACCCGAGCAT AAATAATAAAscaffold 531 CTTAGGGCTAATCCCTCCAACCTGCATCAA G A 532CTCACTTTAAGTCCGGTCTCCTTTCAAACACAATTG 2483:94844CAAAACAATGCTAGATTTTAAGAAATCACA AGAAAGCCTGTCGAATAGACAAGTCTTCGGTGATTTAACACATTAATAAATTTCTTTGTAGCATCA CTTGTCTTACTTTATCTGTCTGCATTAT ATTGCAGTTTscaffold 533 ATAAATAAAGGCTAAAATGAGTTGAACTCA T G 534TTGATTTATGATGTTCTTTGACATTCTATAAAGATA 3871:24039ACACCACAAGAGTTTGTGGACACACTACAT AATCTATTTGTGCTCATATCTATCAAATATCTTCCACAAATGGGCCACAACATCTTTTACATTAGG CCTTACCGGAGATATTAATATGTGAGTA AAAATTTACAscaffold 535 TAGATCCACCTCCAATGAGGGGAAGAAAGC T C 536AGAATTGTAGCCTGTTACAAAAATTAGGAAAGATAT 616:154319TATACTTCTTAGCAAATTCTGCCCACCGGT 
TATTGACCAAAAATCACAAATTTGAATACATACGTATTAAAAGTCTATCCTCTGTGCCAACCAATG TCGATCATGAAAAAACAAGTTTATCATG CAGCAGATTTscaffold 537 ATATTATTTCCATCCAGATGTTTCCAGATC T A 538AACTGGGCATATAGTAATAACCAACTTTTATCAGTG 1005:75114CGGGTATTATAAAAATGCAAGTTAAAATCT GAGCACTGGTCATCATCTCATTCCTGCAGCAGGAATTGCTAATATTAGATTGAGAAAAAGATGGCA TCTGGTTAATATCACATACAATATCTCT TTGAAAAGGAscaffold 539 GACACATATGACATTGATATGTGTTATTAC T C 540CAGAAGTATTTGAGTCACACTTCATCAGTTTAGTGT 4991:44417GTTTGAAGACCCGAGCATAGTTTATTGAAA AAAAACAAGATCCTTACAGATACTGCAGCAACAGGATTAGCCAGCTGATCAGAAAAGTTAGATATG TAATAAATCCAACACTAACATGCTCCTC AGAAAACACAscaffold 541 ACTATGTCTTCTAAGGCGATTTTGTTTCGG G A 542ATGAATCTTTTGTAGGCCTTGGACACCTTTAACACA 38:121620TTGTTTAGGCGACGCGACCATGGATCTTCT GCAGTTATATCTCCGGTCTACTATGTTATGTGTTACCGTATGTAGTACTATTAGAATGTTTAATCT GTCGTTCACCATCTCGGAGCTGATCATG GTTTTACTTGscaffold 543 ATGTAAGGAAATAAACCTATGTAATATTAT A G 544AAAAAGATGGCATTGAAAAGGAAAACTGGGCATATA 1005:75091TTCCATCCAGATGTTTCCAGATCCGGGTAT GTAATAACCAACTTTTATCAGTGGAGCACTGGTCATTATAAAAATGCAAGTTAAAATCTTGCTAAT CATCTCATTCCTGCAGCAGGAATTCTGG ATTAGATTGAscaffold 545 CATGATAAACTTGTTTTTTCATGATCGATA A G 546AAATCTGCTGCATTGGTTGGCACAGAGGATAGACTT 298:209613CGTATGTATTCAAATTTGTGATTTTTGGTC TTAAACCGGTGGGCAGAATTTGCTAAGAAGTATAGCAATAATATCTTTCCTAATTTTTGTAACAGG TTTCTTCCCTCATTGGAGGTGGATCTAC CTACAATTCTscaffold 547 AATATAAAATTCAATGCCAAGTGATTCAAA T C 548GTGTACGGTCATATTTCTTAAAACAAATAAATAAAT 152:25429ACCGGTTTAATCTTCAATAAATCTTTATTA AAAATCACACAAGCCAATTGGAAAGAGGCTATTCAAGTAATGGCTTGCATTCTTCAAAAAGAATCA CTTTATACAATATAGACCTATAATGTAC ACCTTTACAAscaffold 549 AGCCTCATTTGCTCGATTTGAACAGCAATC T C 550AAGCAATTTGCATTTGTGAGAAATGAGGGTGGAACA 1003:70848AACTCTCTGGCACTGTCCCCGCCGAGCTTG GCCTGCAGGGGCCGGAGGACTAGTTGAATTCGAGGGCTAACCAGGCTGGCCTAGTGGTTCCCGGTA TGTTCGGCCTGAGAGACTAGAAAACTCT TTGTTTCTGGscaffold 551 CCAGAAGTATTCCATGTTCATTTGCACTAT A G 552TATATTATATTATAAATCTCAAGAGCTGCAGAGATT 839:64955CGACCTACATCATTGAAATGAACCAGGAAC GAAAAAAAAAAAGTCATACCCTAATTACCGTAGCATAAAAATATCTTGACTTAATTAAAACAATAT TCTTGCAAGAATCATTATCAATTACAAC AGATATATATscaffold 553 
TGTGAAAACCTTTCCGTGGTATCAATCCAA C G 554TTCTCTCACACAAGCATTCTTGGGGATTGCATAAAA 4070:22473TACTTTGGTTTTTGCTTGCAGTCACCCAAT ATAATCTTCCTCCTTTGGCCTGTCTGCAATTACTCTTGCAGTTGAGTTACCGGGGCTAATCAAGTC GCGTCGACTTGGTTGTCCCATCCTTTTC ATCGCAACTTscaffold 555 GTTACACCATGATGGTGAACTCTCCGGAAC T G 556TAACATTTAATATTGTACTAATTCTAATTACACAGA 1419:187408TTGGCCATTGGACATATTGCACACCTACAA TACCTTCAAATTTCTGTAAACTGGCTTTGCATACAAATCAAACTTTGGAATAAGACAACTATCAAC ATGATCAATACCAAGAAAGTAACGATCG AATTACATATscaffold 557 CACTACATCAAATGGGCCACAACATCTTTT G A 558TATCTATCAAATATCTTCCACCTTACCGGAGATATT 3871:24091ACATTAGGAAAATTTACAGTTGATTTATGA AATATGTGAGTATGCTCAAATAATATGACCAGAAAATGTTCTTTGACATTCTATAAAGATAAATCT TATAACAAAATCAATATTCATCGACTAC ATTTGTGCTCscaffold 559 CCAATTGTTTCTTCACAATACCACACACCT C A 560CAAATCATTACCTTTACACAAGGGTTGATAGCAGAA 3871:23604TCTCCACTGTTTCTTTCTTGGCCTGCAGAA ACATCAATGATGCTGTGTCATAAGGGCTAAACACAAGGATGCCATACATAATTAGCTCCAAAATAA AGTAACGGATGTCAATGGAACAACTACA AAGTAAAAAAscaffold 561 GCCAATTGTTTCTTCACAATACCACACACC G A 562ACAAATCATTACCTTTACACAAGGGTTGATAGCAGA 3871:23603TTCTCCACTGTTTCTTTCTTGGCCTGCAGA AACATCAATGATGCTGTGTCATAAGGGCTAAACACAAGGATGCCATACATAATTAGCTCCAAAATA AAGTAACGGATGTCAATGGAACAACTAC AAAGTAAAAAscaffold 563 CGAGACGAGACGCGACCGTGGTTTTGAAAC C T 564TTCTTTTCGCGGAAACCAAACAGGAAATACAATTGA 3:388526CACAACAACAACAACTAGGGTTCATCCCTA CCATGTCGAAGAGGAAATTCGGATTCGAAGGCTTTGCCAAAAACCCCTCTTTACAATTCCTGCAGA GCATAAACCGCCAAACGACTTACAACTT TTTACCGCTTscaffold 565 GATTACCTTACAAACCGCATTCAAAATGGA A G 566GTTAGTTTAGTCTGTTCAAATTTTCCGAATCTGAAG 4:729494GGAACTGAGGTCGTCGAGGTAAATTTCTTT TAGTATGACTTTTGGTATCTCACTGAAATGTTCATTATTTCTTTTTAGTTTGATGAATGCCTGCAG TTGGTATCTCAGTAAACCAAGTAATTGC TTCATTTGGAscaffold 567 CTGAGCTCCATTATCTATATAAACTCATGA C G 568ATAAGATACCTTGTTTAAAACCAGAAAGATAAAATA 2283:29120GCACTTCCGGACGACTTCTCAACGATATCC TAAGTTTTCGTTGTGATGAGTTTTTGGCATTTTCTAACGGCTTCAAGGTAACAATAGTTTATCCTT ATCACTCTATTGGTTATGCTGCAGAGAG TCCACTGGTCscaffold 569 CTGGAACCACTAGCGAAGAAGCAGCGAAAG T C 570ACACAACTCCTGGAGACAAAAATGTTCTGATCAGGT 4350:64549TGACATCATTGCATTCTAATTGTTTGCCCC 
TAGAGATGCTCATGTGTCACATATTTATTTCGGGTAGGATATAGAATTGGGGCCCGATCACCCAAT GCTGGGTAGCCACTCCTTTTTTCACAGA TGTATGTAACscaffold 571 TTTCTGAGGATGTTACAAGACATCCTGAGC T A 572ATAGTTTATCCTTTCCACTGGTCGATAAGATACCTT 2283:29096TCCATTATCTATATAAACTCATGAGCACTT GTTTAAAACCAGAAAGATAAAATATAAGTTTTCGTTCCGGACGACTTCTCAACGATATCCACGGCT GTGATGAGTTTTTGGCATTTTCTAATCA TCAAGGTAACscaffold 573 AGGGGGATAATTGTACGACATTCGATGTTA A C 574TGTGGTGTTGAATCCGGTGTCGTTTTGGCCCCTGCT 3469:17543TGTCGTTTGGTGCAGTTGGAGATGGTGTGG GATTATTGTTTTAAAATTACTTCTACTATCTTTTCTCAGACGACACTGCAGCCTTTAAAGAGGCAT GGACCCTGCAGCCCCGGATTAATGTTCC GGAAAGCTGCscaffold 575 ACCCGTGATTCTCTGAAATCATTTTATTTT A T 576GTATTTGTATGTTTGTGTAAGATAAATCATAAATTT 1022:170497CCGTGCCTTATTACATAAGGAAGGAAGAAA CTAACGAACTCTTTAAACCACAGTTCCATGACCGCCAGAACTCGTATTTTGGGTTCTTTCCATCTT ACCACCTTCATCGACCGCCCGGTTATAT CGTATGTGTTscaffold 577 GAGTAGGAATGGGTTTCTTTGATCTCAATA C T 578GGTTACTCCGGAGTTGCTTACAACCGTACCATCAAG 964:106240TACCATATTTGGAGGCCTTGCCGTCTGATA GGTGTGATGTCAGAACGTGATCGTTGCTCCATTCCCAGGCTACTCGGATCAAGCTAGCTACTATGG CTTCTTACTCTCTCTTCTCTCCTCAAGC CCATGGAGCTscaffold 579 ATCATAAACTTATTTGGCTTAATCTGAAAA T A 580ACCATGTAAACAGAGGTATCATTTGATGGAAGTGAA 1419:187788TAGCCAACATAAATGATGAAATAACAGAAG GAGGCTGCAGGTAACACTTTGAATTTCCAATCAGGAAAAAGCAAGTAAATTTAAGAATTCAGCGAT ACGTATAGACCATATAATAACCCCGTGT GAAATTATCTscaffold 581 TTTCTAGCATTGGAGATGACATTGATATGG T A 582GGGCTGCAGACTTTATAAGCAGCAATTCTTGCAGAG 3386:35408TAATGATGTCGACACCATCAGCAATAGCGT GGAACTCCTCCTCTTGCAATGCCTTCTTTTAGTCCACGTCAAAAGCTGCCAACATAGCTTCTGAAT TAGAAACTGACATTTTTTACATTGTTCC AGCACCCTGTscaffold 583 TATAATCTGCTCTTCTTGTTAGTATGTTTG G C 584AGGATTGTGTTTGACATTTTCTTATCTTGGTGGAAA 575:241067CCTTTTATATTCCGGGTAGATTTGTCTTAT CATTTTTAAGCAAACACGTGTTGCAGGCTTGACTAATTAAAAACCCTTTCTGTGAGAAGACTTTTA TCTGATTTTCACGTGGAAGATTGAACTT TATGTATAGTscaffold 585 AGAATTTATACAGTAGAACCTCACTAATAT C G 586TTGTTTATAAAATTTTAATCATTAAGAAACACAGCA 1863:93953ATACGAAGAATAACCTCAAAAATTAAATCT AATAAACCTAACAAGTATTGAAAAATGTCCATTAAATGGGCAGTTATAAATAAAAATAATTTTTAA AGAAACCATCGTTCTTTCCGGTCTTGAA ATTGTAAAACscaffold 
587 GACCATAAGACACTGCAGAGCTATGGCTCG G C 588CCCCCCGGATCAGCACCGTGACCATGTTCTCCCGAG 575:459449AATCACCACCATTTCCATTTTCTGTTGAGT TTCGAATGTGAACTATTATTCCCTCTGTATCGATTATCAAATGCAAATTAGCATTCCCCCTGTACT GACTATACGAAGCTGAAACACTGTCATT GATAACTATGscaffold 589 TCTGTTTAAGGATTTCTGCAGCAATCGGGA T C 590AGTGTCTTTACTATACCGGCATTGATGTTGAATAGG 773:155167CGGTTGAGTTGACTGGATTACTGATTATGT TCATCACGAGTCATTCCAGGCTTTCTCGGAACTCCAGGATAAAGGCATCAGGGCAGTTGTCAGCAA GCAGGAATGACAACAACATTTACGCCTT CAGCCTCAATscaffold 591 CCTTCTGTTTAAGGATTTCTGCAGCAATCG G A 592ATCAGTGTCTTTACTATACCGGCATTGATGTTGAAT 773:155164GGACGGTTGAGTTGACTGGATTACTGATTA AGGTCATCACGAGTCATTCCAGGCTTTCTCGGAACTTGTGGATAAAGGCATCAGGGCAGTTGTCAG CCAGCAGGAATGACAACAACATTTACGC CAACAGCCTCscaffold 593 TATTTGAATCCACTCAAGTCCACTTCACAA C T 594CAACTCAAGTCCATGTAGTTGGTTGTTGATACTTGT 107:154377TAAGTGGTTACAAGCTCACGCGGCCGGTGG GATTTACTTATATAAGAGTAACGATATGTGCACTCATGTGAAGCTCACAATGCAACATTTGAGCCT AATTATGCATCAAAATATTAGATACAAT TTGTAGGGATscaffold 595 GAACACACATCTTTCTCCGGCAACTTTAGT C T 596ACCTTCAACTTCCTAATTGACAATGCATTTGGTAAT 371:34271ACATTGCCGTTCAGTGCTTGCAGAAGCTTC GATTTTTTTTTTCTTCAGGTTACCTTGAAATTGTTCCTTCCAATAGCAGCAAGTACACATATTCGT ACATTTATAGCAGCATGAATGGTTGAAA GTGATGGATAscaffold 597 TAATAGTTCTTCTATTGCTGGTAATCGTCT A G 598TTCTTCGCATCAGGTTCAGTTGCCCTTTATGTATTC 78:341641GTGCTGAAGTGTCGGTGGTCCTCACCTACA TTGTACTCTATCAATTACTTGGTGTTTGACCTGCAGTGCATCTCTGCGTAGAGGATTGGCGGTGGT AGTTTGAGTGGACCCGTCTCCGCTGTAC GGTGGAAGGCscaffold 599 CCTATGTGATCGTCGCGAATGTGTTATCAA T A 600GGGCGTAGTTGGATTGAAGTGCGCGGTAAGGTCCAC 156:85625GTGCAGGAAGGTGGGAAGAAGTTTCTGAAG GAGTTCTTAGCCGGCGATCACATACACGAAATGAGGTGAGGAAGATGATGAAAAACCAGAGGGTGA GACGATATTTATAAGAAACTAACCGAGT AGAAGGAAGT

TABLE 10
Sample ID | Sample Name | Sample Type | Reference for Reported Ancestry
1 | Afghani Regular | indica | http://medicannseeds.com/seed/afghani-regular/
2 | Afghani Regular | indica | http://medicannseeds.com/seed/afghani-regular/
3 | Afghani Regular | indica | http://medicannseeds.com/seed/afghani-regular/
4 | Afghani Regular | indica | http://medicannseeds.com/seed/afghani-regular/
5 | Hindu Kush | indica | https://www.kiwiseeds.com/kiwiseeds/kiwiseeds-hindu-kush-34671.html
6 | Pakistan Chitral Kush | indica | https://www.cannabiogen.com/Producto/PAKISTAN%20CHITRAL%20KUSH%20-6.html
7 | Pakistan Chitral Kush | indica | https://www.cannabiogen.com/Producto/PAKISTAN%20CHITRAL%20KUSH%20-6.html
8 | Ketama | indica | http://www.worldofseeds.eu/wos_en/ketama.html
9 | Pure Afghan | indica | http://dnagenetics.com/seeds/pure-afghan
10 | Enemy Of The State | indica | http://superstrains.biz/shop/enemy-of-the-state/
11 | Enemy Of The State | indica | http://superstrains.biz/shop/enemy-of-the-state/
12 | Enemy Of The State | indica | http://superstrains.biz/shop/enemy-of-the-state/
13 | Master Kush | indica | http://homegrown-fantaseeds.com/product/masterkush
14 | Master Kush | indica | http://homegrown-fantaseeds.com/product/masterkush
15 | Master Kush | indica | http://whitelabelseeds.com/seeds/wlsc/master-kush
16 | Master Kush | indica | http://whitelabelseeds.com/seeds/wlsc/master-kush
17 | Master Kush | indica | http://whitelabelseeds.com/seeds/wlsc/master-kush
18 | Master Kush | indica | http://whitelabelseeds.com/seeds/wlsc/master-kush
19 | Master Kush | indica | http://whitelabelseeds.com/seeds/wlsc/master-kush
20 | Pakistan Valley | indica | http://www.worldofseeds.eu/wos_es/pakistan-valley.html
21 | Pakistan Valley | indica | http://www.worldofseeds.eu/wos_es/pakistan-valley.html
22 | Afghani | indica | http://homegrown-fantaseeds.com/product/afghani
23 | Afghani | indica | http://homegrown-fantaseeds.com/product/afghani
24 | Northern Lights | indica | http://www.ministryofcannabis.com/feminized-cannabis-seeds/northern-lights-moc-feminized
25 | Northern Lights | indica | http://www.ministryofcannabis.com/feminized-cannabis-seeds/northern-lights-moc-feminized
26 | Northern Lights | indica | http://www.ministryofcannabis.com/feminized-cannabis-seeds/northern-lights-moc-feminized
27 | Hash Passion | indica | http://www.seedsman.com/en/hash-passion-seeds
28 | Narkush | indica | http://www.seedsman.com/en/narkush-seeds
29 | Narkush | indica | http://www.seedsman.com/en/narkush-seeds
30 | Narkush | indica | http://www.seedsman.com/en/narkush-seeds
31 | Narkush | indica | http://www.seedsman.com/en/narkush-seeds
32 | Kush | indica | http://www.ceresseeds.com/online/en/regular-seeds/182-ceres-kush.html
33 | Kush | indica | http://www.ceresseeds.com/online/en/regular-seeds/182-ceres-kush.html
34 | Kush | indica | http://www.ceresseeds.com/online/en/regular-seeds/182-ceres-kush.html
35 | Hindu Kush | indica | https://sensiseeds.com/en/cannabis-seeds/sensi-seeds/hindu-kush
36 | Hindu Kush | indica | https://sensiseeds.com/en/cannabis-seeds/sensi-seeds/hindu-kush
37 | Hindu Kush | indica | https://sensiseeds.com/en/cannabis-seeds/sensi-seeds/hindu-kush
38 | Malawi | sativa | http://www.aceseeds.org/en/malawifem.html
39 | Malawi | sativa | http://www.aceseeds.org/en/malawifem.html
40 | Guatemala | sativa | http://www.aceseeds.org/en/guatemalastd.html
41 | Guatemala | sativa | http://www.aceseeds.org/en/guatemalastd.html
42 | Guatemala | sativa | http://www.aceseeds.org/en/guatemalastd.html
43 | Malawi Gold | sativa | http://www.seeds-of-africa.com/malawi-gold/
44 | Malawi Gold | sativa | http://www.seeds-of-africa.com/malawi-gold/
45 | Malawi Gold | sativa | http://www.seeds-of-africa.com/malawi-gold/
46 | Malawi Gold | sativa | http://www.seeds-of-africa.com/malawi-gold/
47 | Pondo Mystic | sativa | http://www.seeds-of-africa.com/pondo-mystic/
48 | Swazi Gold | sativa | http://www.seeds-of-africa.com/swazi-gold/
49 | Swazi Gold | sativa | http://www.seeds-of-africa.com/swazi-gold/
50 | Swazi Gold | sativa | http://www.seeds-of-africa.com/swazi-gold/
51 | Swazi Gold | sativa | http://www.seeds-of-africa.com/swazi-gold/
52 | Swazi Gold | sativa | http://www.seeds-of-africa.com/swazi-gold/
53 | Swazi Gold | sativa | http://www.seeds-of-africa.com/swazi-gold/
54 | Durban Magic | sativa | http://www.seeds-of-africa.com/durban-magic/
55 | Mozambica | sativa | http://www.seeds-of-africa.com/mozambica/
56 | Mozambica | sativa | http://www.seeds-of-africa.com/mozambica/
57 | Mozambica | sativa | http://www.seeds-of-africa.com/mozambica/
58 | Mozambica | sativa | http://www.seeds-of-africa.com/mozambica/
59 | Transkei | sativa | http://www.seeds-of-africa.com/transkei/
60 | Transkei | sativa | http://www.seeds-of-africa.com/transkei/
61 | Transkei | sativa | http://www.seeds-of-africa.com/transkei/
62 | Transkei | sativa | http://www.seeds-of-africa.com/transkei/
63 | Zimbabwe | sativa | http://www.seeds-of-africa.com/zimbabwe/
64 | Zimbabwe | sativa | http://www.seeds-of-africa.com/zimbabwe/
65 | Zimbabwe | sativa | http://www.seeds-of-africa.com/zimbabwe/
66 | Lao Sativa | sativa | http://original-ssc.com/laos-luang-prabang-lao-sativa-seeds-ace-seeds.html
67 | Lao Sativa | sativa | http://original-ssc.com/laos-luang-prabang-lao-sativa-seeds-ace-seeds.html
68 | Purple Haze | sativa | http://en.seedfinder.eu/strain-info/Purple_Haze/ACE_Seeds/
69 | Purple Haze | sativa | http://en.seedfinder.eu/strain-info/Purple_Haze/ACE_Seeds/
70 | Zimbabwe | sativa | http://www.seeds-of-africa.com/zimbabwe/
71 | Malawi Gold | sativa | http://www.seeds-of-africa.com/malawi-gold/
72 | Malawi Gold | sativa | http://www.seeds-of-africa.com/malawi-gold/
73 | Durban Magic | sativa | http://www.seeds-of-africa.com/durban-magic/
74 | Durban Magic | sativa | http://www.seeds-of-africa.com/durban-magic/
75 | Durban Magic | sativa | http://www.seeds-of-africa.com/durban-magic/
76 | Durban Magic | sativa | http://www.seeds-of-africa.com/durban-magic/
77 | Durban Magic | sativa | http://www.seeds-of-africa.com/durban-magic/
78 | Pondo Mystic | sativa | http://www.seeds-of-africa.com/pondo-mystic/
79 | Pondo Mystic | sativa | http://www.seeds-of-africa.com/pondo-mystic/
80 | Pondo Mystic | sativa | http://www.seeds-of-africa.com/pondo-mystic/
81 | Pondo Mystic | sativa | http://www.seeds-of-africa.com/pondo-mystic/
82 | Pondo Mystic | sativa | http://www.seeds-of-africa.com/pondo-mystic/
83 | Coffee Gold | sativa | http://www.seeds-of-africa.com/coffee-gold/
84 | Coffee Gold | sativa | http://www.seeds-of-africa.com/coffee-gold/
85 | Coffee Gold | sativa | http://www.seeds-of-africa.com/coffee-gold/
86 | Coffee Gold | sativa | http://www.seeds-of-africa.com/coffee-gold/
87 | Mozambica | sativa | http://www.seeds-of-africa.com/mozambica/
88 | Mozambica | sativa | http://www.seeds-of-africa.com/mozambica/
89 | Mozambica | sativa | http://www.seeds-of-africa.com/mozambica/
90 | Mozambica | sativa | http://www.seeds-of-africa.com/mozambica/
91 | Mozambica | sativa | http://www.seeds-of-africa.com/mozambica/
92 | Mozambica | sativa | http://www.seeds-of-africa.com/mozambica/
93 | Transkei | sativa | http://www.seeds-of-africa.com/transkei/
94 | Transkei | sativa | http://www.seeds-of-africa.com/transkei/
95 | Transkei | sativa | http://www.seeds-of-africa.com/transkei/
96 | Swazi Gold | sativa | http://www.seeds-of-africa.com/swazi-gold/
97 | Swazi Gold | sativa | http://www.seeds-of-africa.com/swazi-gold/
98 | Swazi Gold | sativa | http://www.seeds-of-africa.com/swazi-gold/
99 | Swazi Gold | sativa | http://www.seeds-of-africa.com/swazi-gold/
100 | Swazi Gold | sativa | http://www.seeds-of-africa.com/swazi-gold/

While the present application has been described with reference to what are presently considered to be the preferred examples, it is to be understood that the application is not limited to the disclosed examples. To the contrary, the application is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

All publications, patents and patent applications are herein incorporated by reference in their entirety to the same extent as if each individual publication, patent or patent application was specifically and individually indicated to be incorporated by reference in its entirety. Specifically, the sequences associated with each accession number provided herein, including for example accession numbers and/or biomarker sequences (e.g. protein and/or nucleic acid) provided in the Tables or elsewhere, are incorporated by reference in their entirety.

CITATIONS FOR REFERENCES REFERRED TO IN THE SPECIFICATION

-   1. Hillig K. Genetic evidence for speciation in Cannabis (Cannabaceae). Genet. Resour. Crop Evol. 2005; 52(2):161-80.
-   2. de Meijer E P M. The Chemical Phenotypes (Chemotypes) of Cannabis. In: Pertwee R G, editor. Handbook of Cannabis. Handbooks in Psychopharmacology: Oxford University Press; 2014. p. 89-110.
-   3. van Bakel H, Stout J, Cote A, Tallon C, Sharpe A, Hughes T, et al. The draft genome and transcriptome of Cannabis sativa. Genome Biol. 2011; 12(10):R102.
-   4. Bostwick J M. Blurred Boundaries: The Therapeutics and Politics of Medical Marijuana. Mayo Clin. Proc. 2012; 87(2):172-86.
-   5. Elshire R J, Glaubitz J C, Sun Q, Poland J A, Kawamoto K, Buckler E S, et al. A Robust, Simple Genotyping-by-Sequencing (GBS) Approach for High Diversity Species. PLoS ONE. 2011; 6(5):e19379.
-   6. Raj A, Stephens M, Pritchard J K. fastSTRUCTURE: Variational Inference of Population Structure in Large SNP Datasets. Genetics. 2014.
-   7. de Meijer E P M, Bagatta M, Carboni A, Crucitti P, Moliterni V M C, Ranalli P, et al. The Inheritance of Chemical Phenotype in Cannabis sativa L. Genetics. 2003; 163(1):335-46.
-   8. Piluzza G, Delogu G, Cabras A, Marceddu S, Bullitta S. Differentiation between fiber and drug types of hemp (Cannabis sativa L.) from a collection of wild and domesticated accessions. Genet. Resour. Crop Evol. 2013; 60(8):2331-42.
-   9. Hinds D A, Stuve L L, Nilsen G B, Halperin E, Eskin E, Ballinger D G, et al. Whole-Genome Patterns of Common DNA Variation in Three Human Populations. Science. 2005; 307(5712):1072-9.
-   10. Hazekamp A, Fischedick J T. Cannabis - from cultivar to chemovar. Drug Test Anal. 2012; 4(7-8):660-7.
-   11. Small E, Cronquist A. A Practical and Natural Taxonomy for Cannabis. Taxon. 1976; 25(4):405-35.
-   12. Salentijn E M J, Zhang Q, Amaducci S, Yang M, Trindade L M. New developments in fiber hemp (Cannabis sativa L.) breeding. Ind Crops Prod. 2014.
-   13. Franz-Warkentin P. Hemp production sees steady growth in Canada 2013 [cited 2014]. Available from: http://www.agcanada.com/daily/hemp-production-sees-steady-growth-in-canada.
-   14. Agricultural Act of 2014, Pub. L. No. 113-17 Stat. 128 (Feb. 7, 2014).
-   15. Sonah H, Bastien M, Iquira E, Tardivel A, Légeré G, Boyle B, et al. An Improved Genotyping by Sequencing (GBS) Approach Offering Increased Versatility and Efficiency of SNP Discovery and Genotyping. PLoS ONE. 2013; 8(1):e54603.
-   16. Gardner K M, Brown P, Cooke T F, Cann S, Costa F, Bustamante C, et al. Fast and Cost-Effective Genetic Mapping in Apple Using Next-Generation Sequencing. G3 (Bethesda). 2014; 4(9):1681-7.
-   17. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira M A R, Bender D, et al. PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses. Am J Hum Genet. 2007; 81(3):559-75.
-   18. Jombart T, Ahmed I. adegenet 1.3-1: new tools for the analysis of genome-wide SNP data. Bioinformatics. 2011; 27(21):3070-1.
-   19. Weir B S, Cockerham C C. Estimating F-Statistics for the Analysis of Population Structure. Evolution. 1984; 38(6):1358-70.
-   20. Hillig K W, Mahlberg P G. A Chemotaxonomic Analysis of Cannabinoid Variation in Cannabis (Cannabaceae). American Journal of Botany. 2004; 91(6):966-975.
-   21. Willing E-M, Dreyer C, van Oosterhout C. Estimates of Genetic Differentiation Measured by F_(ST) Do Not Necessarily Require Large Sample Sizes When Using Many SNP Markers. PLOS ONE. 2012; 7(8):e42649.
-   22. McClure K A, Sawler J, Gardner K M, Money D, Myles S. Genomics: A Potential Panacea for the Perennial Problem. Am J Botany. 2014; 101:1780-90.
-   23. Paetkau D, Calvert W, Stirling I, Strobeck C (1995) Microsatellite analysis of population structure in Canadian polar bears. Molecular Ecology, 4, 347-354.
-   24. Hansen M M, Kenchington E, Nielsen E E (2001) Assigning individual fish to populations using microsatellite DNA markers: Methods and applications. Fish and Fisheries, 2, 93-112.
-   25. Campton D E and Utter F M, 1985. Natural hybridization between steelhead trout (Salmo gairdneri) and coastal cutthroat trout (Salmo clarki clarki) in two Puget Sound streams. Can J Fish Aquat Sci 42:110-119.
-   26. Poland J A, Brown P J, Sorrells M E, Jannink J L. Development of High-Density Genetic Maps for Barley and Wheat Using a Novel Two-Enzyme Genotyping-by-Sequencing Approach. PLoS ONE. 2012; 7(2):e32253.
-   27. Melo A T, Bartaula R, Hale I. GBS-SNP-CROP: a reference-optional pipeline for SNP discovery and plant germplasm characterization using variable length, paired-end genotyping-by-sequencing data. BMC Bioinformatics. 2016; 17:29.

The invention claimed is:
1. A method for testing a sample comprising cannabis to determine if the cannabis is hemp or marijuana, the method comprising: I) obtaining a test sample comprising genomic DNA, II) genotyping the test sample for a set of SNPs, the set comprising at least 10, 20, 30, 40, 48, 50, 60, 70, 80, 90, 96 or 100 of the SNPs in Table 5, each SNP comprising a major allele and a minor allele as provided in Table 5; III) detecting for each SNP of the set the presence or absence of the major allele and/or the minor allele in the test sample; IV) determining the sample is hemp or marijuana according to the set of SNPs, wherein at least 10, 20, 30, or 40 of the genotyped SNPs of the set have an FST of greater than 0.679 as provided in Table 5, and at least 2 of the genotyped SNPs of the set have a major allele or minor allele with an allele frequency of 0 as provided in Table 5, wherein the test sample is determined to be hemp if the major alleles and/or minor alleles in combination when compared to the reference profiles provided in Table 5 are most similar to major alleles and/or minor alleles more commonly found in hemp, or the test sample is identified as marijuana if the major alleles and/or minor alleles in combination when compared to the reference profiles provided in Table 5 are most similar to major alleles and/or minor alleles more commonly found in marijuana; and V) displaying and/or providing a document displaying one or more features of the major and/or minor allele for each SNP in the set and/or the identity of the test sample as hemp or marijuana, wherein one or more features and/or identity of the test sample is used to select a sample with desired combination of major alleles and/or minor alleles, wherein the test sample is selected from a genomic DNA sample, a plant sample, a seed sample, leaf sample, a flower sample, a trichome sample, a pollen sample, and a sample of dried plant material including flower, pollen and trichomes, and the obtaining the sample comprises isolating genomic DNA, wherein the method further comprises using the presence or absence of the major allele and/or the minor allele for each SNP of the set in marker assisted selection (MAS) to select a cultivated cannabis plant, crossing the cultivated cannabis plant with a wild type cannabis plant, and selecting an offspring cannabis plant with a desired combination of major alleles and/or minor alleles.
2. The method of claim 1, wherein the set of SNPs comprises the SNPs in Table 5 with an Fst of greater than 0.679.
3. The method of claim 1, wherein the genotyping method comprises a PCR based method.
4. A method of cannabis ancestry selection breeding, the method comprising: a) obtaining one or more cannabis plant offspring having a desired trait; b) determining the hemp or marijuana ancestry contribution of the cannabis plant offspring; c) selecting one or more cannabis plant offspring having a desired hemp or marijuana ancestry contribution; and d) crossing the selected cannabis plant offspring with cultivated hemp or marijuana; wherein determining the hemp or marijuana ancestry contribution of the cannabis plant offspring in step b) comprises: I) obtaining a sample comprising genomic DNA from the cannabis plant offspring, II) genotyping the sample for a set of SNPs, the set comprising at least 10, 20, 30, 40, 48, 50, 60, 70, 80, 90, 96 or 100 of the SNPs in Table 5, each SNP comprising a major allele and a minor allele as provided in Table 5; III) detecting for each SNP of the set the presence or absence of the major allele and/or the minor allele in the sample; and IV) determining the hemp or marijuana ancestry contribution of the cannabis plant offspring according to the set of SNPs based on the reference profiles for marijuana and/or hemp provided in Table 5; wherein at least 10, 20, 30, or 40 of the genotyped SNPs of the set have an FST of greater than 0.679 as provided in Table 5, and wherein at least 2 of the genotyped SNPs of the set have a major allele or minor allele with an allele frequency of 0.
5. The method of claim 4, wherein the cannabis plant offspring is F2 offspring obtained by: crossing an initial cultivated hemp or marijuana strain with a wild cannabis strain having a desired trait to obtain F1 offspring having the desired trait; and backcrossing the F1 offspring having the desired trait to the initial cultivated hemp or marijuana strain to obtain F2 offspring having the desired trait.
6. The method of claim 4, wherein the set of SNPs comprises the SNPs in Table 5 with an Fst of greater than 0.679, and wherein the method further comprises displaying and/or providing a document displaying one or more features of the major and/or minor alleles.
7. The method of claim 4, wherein the test sample is selected from a genomic DNA sample, a plant sample, a seed sample, leaf sample, a flower sample, a trichome sample, a pollen sample, and a sample of dried plant material including flower, pollen and trichomes, and the obtaining the sample comprises isolating genomic DNA.
8. The method of claim 4, wherein the genotyping method comprises a PCR based method.
9. The method of claim 8, wherein the genotyping method comprises DNA amplification using forward and reverse primers and/or primer extension.
10. A method for testing a sample comprising cannabis to determine if the cannabis is hemp or marijuana, the method comprising: I) obtaining a test sample comprising genomic DNA; II) genotyping the test sample for a set of SNPs, the set comprising at least 10, 20, 30, 40, 48, 50, 60, 70, 80, 90, 96 or 100 of the SNPs in Table 5, each SNP comprising a major allele and a minor allele as provided in Table 5; III) detecting for each SNP of the set the presence or absence of the major allele and/or the minor allele in the test sample; IV) determining the sample is hemp or marijuana according to the set of SNPs, wherein at least 10, 20, 30, or 40 of the genotyped SNPs of the set have an FST of greater than 0.679 as provided in Table 5, and at least 2 of the genotyped SNPs of the set have a major allele or minor allele with an allele frequency of 0 as provided in Table 5, wherein the test sample is determined to be hemp if the major alleles and/or minor alleles in combination when compared to the reference profiles provided in Table 5 are most similar to major alleles and/or minor alleles more commonly found in hemp, or the test sample is identified as marijuana if the major alleles and/or minor alleles in combination when compared to the reference profiles provided in Table 5 are most similar to major alleles and/or minor alleles more commonly found in marijuana; and V) using the presence or absence of the major allele and/or the minor allele for each SNP of the set in marker assisted selection (MAS) to select a cultivated cannabis plant, crossing the cultivated cannabis plant with a wild type cannabis plant, and selecting an offspring cannabis plant with a desired combination of major alleles and/or minor alleles.
11. The method of claim 10, wherein the test sample is selected from a genomic DNA sample, a plant sample, a seed sample, leaf sample, a flower sample, a trichome sample, a pollen sample, and a sample of dried plant material including flower, pollen and trichomes, and the obtaining the sample comprises isolating genomic DNA.
12. The method of claim 10, wherein the genotyping method comprises a PCR based method.
13. The method of claim 12, wherein the genotyping method comprises DNA amplification using forward and reverse primers and/or primer extension.
14. The method of claim 3, wherein the genotyping method comprises DNA amplification using forward and reverse primers and/or primer extension.
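Illustrative note (not part of the claims): the comparison recited in step IV of claims 1 and 10, in which a sample's alleles are judged "most similar" to the hemp or marijuana reference profile, could be sketched as below. This is a minimal sketch under stated assumptions, not the method of the disclosure: the similarity measure (mean absolute difference between the sample's minor allele fraction and each reference frequency), the SNP identifiers, the function name `classify` and all frequency values are hypothetical and are not taken from Table 5.

```python
from typing import Dict, Tuple

# Hypothetical reference profiles: SNP id -> (minor allele frequency in hemp,
# minor allele frequency in marijuana). These numbers are invented for
# illustration and are NOT the values of Table 5.
REFERENCE: Dict[str, Tuple[float, float]] = {
    "snp_a": (0.00, 0.85),
    "snp_b": (0.90, 0.05),
    "snp_c": (0.10, 0.75),
    "snp_d": (0.80, 0.00),
}


def classify(
    genotypes: Dict[str, int],
    reference: Dict[str, Tuple[float, float]] = REFERENCE,
) -> Tuple[str, float, float]:
    """Assign a sample to the reference profile it is closest to.

    genotypes maps SNP id -> number of minor alleles observed (0, 1 or 2).
    Returns (label, mean distance to hemp, mean distance to marijuana).
    The distance is an assumed similarity measure chosen for simplicity.
    """
    hemp_dist = 0.0
    marijuana_dist = 0.0
    used = 0
    for snp, minor_count in genotypes.items():
        if snp not in reference:
            continue  # ignore SNPs without a reference profile
        if minor_count not in (0, 1, 2):
            raise ValueError(f"invalid minor allele count for {snp}: {minor_count}")
        hemp_freq, marijuana_freq = reference[snp]
        sample_freq = minor_count / 2  # diploid sample: fraction of minor alleles
        hemp_dist += abs(sample_freq - hemp_freq)
        marijuana_dist += abs(sample_freq - marijuana_freq)
        used += 1
    if used == 0:
        raise ValueError("no genotyped SNP overlaps the reference set")
    hemp_dist /= used
    marijuana_dist /= used
    if hemp_dist < marijuana_dist:
        label = "hemp"
    elif marijuana_dist < hemp_dist:
        label = "marijuana"
    else:
        label = "undetermined"
    return label, hemp_dist, marijuana_dist


if __name__ == "__main__":
    sample = {"snp_a": 0, "snp_b": 2, "snp_c": 0, "snp_d": 2}
    print(classify(sample))
```

A likelihood-based assignment could be substituted for the distance used here; in that case reference frequencies of exactly 0 would need to be handled so that the logarithm of zero is avoided.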